Ada programs can be regarded as ensembles of machines, one per program
(module), which in turn may be mapped directly into corresponding VLSI
structures on one or more chips with interconnecting (packet switched or
other) communication nets.


Many of the transformation steps, when performed manually, when optimization
is not everywhere crucial, and when care is taken to constrain somewhat the
structure of the source Ada program, appear to be understood.


The research reported here is part of a five-year plan, the first year of
which focuses on "proving" the concepts through a realistic demonstration of
methodology for a specific example Ada program (a silicon representation of
part or all of the DoD Standard Internet Protocol, IP, initially specified in
Ada). Since the mapping from Ada to VLSI is seen as a multistep, iterative
procedure, considerable effort for the following four and a half years will be
invested in the development and tailoring of intermediate languages and
their bridging algorithms (compilers), as needed, and in the objective
criteria for their use, with feedback loops for iterative design.


Implicit in these objectives is the development of a set of hardware
structuring paradigms (rewrite rules) whose application can ensure that
transformation steps between levels of abstraction in the design process are
well structured in order to preserve the integrity and, where possible, the
clarity of the original Ada specification. Some paradigms, but of course not
all, lead to highly efficient implementations.
Abstract

This report summarizes the second six months of work of the coordinated research project,
"Transformation of Ada Programs into Silicon." (The main objectives of this project were
outlined and then introduced in depth in the preceding semiannual report.) In the past seven
months, work has advanced in three main areas. Expanded summaries of work in these areas
(and subareas) are presented:

1. Work on the principal case study of this project: converting the DoD Internet
Protocol to silicon. The full Protocol has been decomposed into three main parts.

The part that handles outbound datagrams has been fully specified in Ada and
an interesting part of that code has been transformed into an NMOS circuit
composite represented in PPL (Path Programmable Logic).

2. A transformation system is being implemented to map Ada program units into
intermediate forms in syntactically correct Ada. These intermediate forms are
suitable for input to the transformation system (ASSASSIN) that automates the
production of the asynchronous control components of the PPL circuit composites.

A theory for synthesizing circuits from system specifications that are more
abstract than Ada is also reported.

3. Research and Development on the design, fabrication, and application of PPL
(Path Programmable Logic) circuit arrays is reported:

a. The ASSASSIN system, which transforms state graphs of state machines
expressed in textual form to self-timed PPL programs and composites, is
operational.

b. A PPL simulator (ASYLIM) has been completed and incorporated into the
PPL design system.

c. Design and composite layout of three different PPL test circuits were sent
out for fabrication. The circuits will be used to check a wide variety of
PPL cells and supporting circuitry.

d. A design technique for ICs representing self-timed stored state machines
and data path components using the PPL cell set has been developed. The
results of the research have produced new PPL macro cells which
augment the set of available cells.
1. Summary

This report summarizes the second six months of work of the coordinated research project,
"Transformation of Ada Programs into Silicon." Project objectives span a broad and ambitious
spectrum (broader than the already broad title implies), hence the term coordinated; this
refers to the fact that, on the one hand, all research within the project is closely related, but
that the overall project success is not predicated on close coupling of individual subproject
results. The main objectives of this project were outlined and then introduced in depth in the
preceding semi-annual report [19]. They are repeated here in briefer and somewhat
updated form:

1. Develop elements of a transformation methodology for converting Ada programs,
or their parts, into VLSI systems. This research includes identifying a sufficient
set of transformation rules for mapping program specifications through
successive levels of representation, from Ada or related abstract specifications, to
integrated circuits.

2. Demonstrate the methodology developed in 1 by manually applying it to a
nontrivial example: transforming an Ada-encoded representation of the DoD
Standard Internet Protocol [20] (or a significant subset thereof) into NMOS
circuitry.

3. Work toward a theory for identifying substructures within Ada programs for
which the transformation methodology is pragmatically attractive.

4.  Develop  specifications  for  a  set  of  software  tools  for  use  in  automating  the 
transformation  methodology  developed  in  1. 

5. Develop a methodology for testing integrated circuits representing Ada program
units and for integrating such circuits into a larger system.

In the past seven months, our work has advanced in three main areas and in several
subareas listed below. Expanded summaries of work in these areas are presented in
succeeding sections of this report.

1. Work on the principal case study of this project: converting the DoD Internet
Protocol to silicon. The full Protocol has been decomposed into three main
parts [18, 13]. The part that handles outbound datagrams has been fully
specified in Ada [14] and part of that code has been transformed into an NMOS
circuit composite [6].

2. Implementing a transformation system to map Ada program units into
intermediate forms in syntactically correct Ada. These intermediate forms
represent <state machine, data path> pairs suitable for input to another
transformation system that automates the production of circuit composites [24].

a. Development of a theory for synthesizing circuits from system
specifications that are more abstract than Ada, e.g., axiomatic algebraic
specifications, or from Ada augmented with ANNA-like specifications that
also allow specification of temporal properties [12, 29, 25, 26].

3. Research and Development on the design, fabrication, and application of PPL
(Path Programmable Logic) circuit arrays.

a. Completion of the transformation system called ASSASSIN, reported in
detail elsewhere [7], which transforms state graphs of state machines
expressed in textual form to self-timed PPL programs and composites.

b. Design and composite layout of three different PPL test circuits called
UU20, UU21, and UU23. UU20 is used to check the read-enable flip-flop,
the write-enable flip-flop, the asynchronous-clear flip-flop, row
pass-transistors, and flip-flop pull-up cells. UU21 checks the Set/Reset
flip-flop, the two-wire latch, the inverter cells, the column pass-transistor,
and the S, R, 1, and 0 cells. UU23 checks the input and output
pad cells. In addition, a test circuit containing several different oscillators
and counters has been included for determining performance.

UU20 and UU21 were sent to MOSIS for the June 4 run, and in July we
were informed that, due to some mask problems, none of the circuits were
completed. We are still waiting for these parts. In September we decided
to process all three test circuits in our own (HEDCO) laboratory.
Problems with mask making equipment have caused delays; however,
UU20 and UU21 are expected out of the process line in late November or
early December. UU23 should also be processed in December.

c. Completion of a PPL simulator called ASYLIM, which has been under
development for the past year. (The work was sponsored primarily by a
commercial company.) The simulator was incorporated into the PPL
design system for use in this project. The main characteristics of this
simulator are outlined in Section 4 of this report.

d. Development of a design technique for ICs representing self-timed stored
state machines and data path components using the PPL cell set. (The
work was sponsored by a private company.) These techniques have been
primarily directed at the design of circuits using a conventional single-rail
Four Cycle signalling protocol. The results of the research have
produced new PPL macro cells which augment the set of available cells.
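
The four-cycle (return-to-zero) signalling named in item d can be sketched in a few lines of Python. This is only an illustrative model, not the PPL cell implementation: the function name four_cycle_transfer is hypothetical, and "single-rail" is read here (as an assumption) to mean ordinary data wires accompanied by separate request and acknowledge lines.

```python
def four_cycle_transfer(value):
    """Trace one four-cycle (return-to-zero) request/acknowledge transfer.

    Returns the value latched by the receiver and the sequence of
    (req, ack) line states seen during the transfer.
    """
    req, ack = 0, 0
    trace = [(req, ack)]          # idle: both lines low

    data_bus = value              # sender places data on the data wires
    req = 1                       # cycle 1: sender raises request
    trace.append((req, ack))

    received = data_bus           # receiver latches the data
    ack = 1                       # cycle 2: receiver raises acknowledge
    trace.append((req, ack))

    req = 0                       # cycle 3: sender withdraws request
    trace.append((req, ack))

    ack = 0                       # cycle 4: receiver withdraws acknowledge
    trace.append((req, ack))      # both lines are back at idle

    return received, trace
```

Both lines return to zero before the next transfer can start, which is what distinguishes the four-cycle protocol from a two-cycle (transition) protocol.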



Second  Semiannual  Technical  Report 




2.  Converting  the  DoD  Internet  Protocol  to  Silicon. 

by 

Elliott I. Organick and Gary Lindstrom

As mentioned previously [19], our design of the Protocol is based on a decomposition into
three submodules: INM_OUT, dealing with traffic outbound on a given local net; INM_IN,
similarly handling inbound traffic; and INM_SRV, tying them together and interfacing to the
Host(s). We envision one INM_IN and INM_OUT pair of submodules for each local net
interface, but only one INM_SRV submodule per Internet Module (INM).

We are following the five-level software development and testing plan discussed in the
preceding report. The levels correspond to IP applications in increasingly generalized settings.
The plan stipulates testing as each level is reached, rather than as an epilog to the
development plan. Testing is to be conducted at several levels, from the physical
characteristics of the circuits themselves to the (Ada) semantic behavior of the submodules
that have been converted to circuits.

After designing (specifying) the interfaces between the submodules [13, 10], we then
selected the INM_OUT (sub)module as the first one to be converted to circuitry. Work toward
this objective in the past seven months has been rapid in some respects and slow in others.

The  specific  and  significant  accomplishments  have  been  as  follows: 

1. We have coded the complete INM_OUT submodule in Ada and have succeeded
in compiling most of it for execution on the Intel iAPX 432 system, except for
statements and declarations associated with uses of the Ada rendezvous
construct.

[As later versions of the Intel compiler become available, we expect not only to be
able to compile the full module using rendezvous syntax and semantics, but to
execute it in this mode as well. In the meantime we are working with a version
of the code that simulates each rendezvous via Send/Receive primitives
instantiated through use of the Ada generic package mechanism.]

2. The INM_OUT submodule is an Ada package named INM_OUT_Module; it
contains three intercommunicating Ada tasks. We are in the process of
transforming each of these tasks into PPL circuit composites, beginning with the
second one listed below:

a. The main task, named INM_OUT, interfaces with INM_SRV and with
LNM_OUT such that a pipeline effect is achieved for speeding datagrams
along the outbound data path: Host module --> INM_SRV -->
INM_OUT --> LNM_OUT.

b. An auxiliary (server) task, named Read_Init_Parameters, which obtains
from host-related memory the initial parameter values needed to perform
datagram transmission. Transformation of this server task, one which is
rich in Ada control structures, is essentially completed. A demonstration
showing the process by which we make the transformation to PPL circuit
composite was given in June 1982 during a DARPA review of our project.

That  demonstration  was  based  on  a  preliminary  version  of  the  Ada  task, 
which  has  now  been  updated.  The  composite  produced  for  the  current 
version  of  the  task  is  more  interesting  and  is  apt  to  resemble  more  closely 
the  one  we  eventually  will  consider  the  final  version. 

c. An auxiliary task named Translate_TOS_Task, which operates in
parallel with INM_OUT, the main task, by translating type-of-service
information from host-level to local-net level encoding.

3. As just mentioned, the task Read_Init_Parameters has now been converted
semi-automatically to PPL circuit composites in NMOS. The conversion into
PPL composite form is discussed in part in a new paper by Carter, to be presented
at a DARPA-sponsored meeting at Stanford on November 5, and in part below.
Carter's paper focuses primarily on the technology for converting the control
structure portion of the Ada task into the self-timed control unit of the
corresponding circuit.

In this report we make some observations on the overall structure of
Read_Init_Parameters and on some of its subtle details. We also comment on
some of the steps we traversed in arriving at this version of the task. A copy of
the body part for the present version of this Ada task is to be found in the
Appendix.

[The complete Ada specification of the INM_OUT submodule, which includes
this task, is given in a separate report [14]. A reader of the Appendix version
only is expected to imagine how the task Read_Init_Parameters interfaces with
the remainder of the entire submodule. A reader of the separate report is treated
to a "road map" of the full Ada structure of the INM_OUT submodule which
helps to understand our overall design.]

4. As a prelude to testing hardware versions of Ada program units and in support of
our work in specifying subsystems in Ada and then simulating them, we
installed, made operational, and have begun using a complete Intel 432 Cross
Development System. This system includes an Ada cross compiler for a large
subset of Ada and a 432 multiprocessor system consisting of two regular and two
interface processors. We expect to receive from Intel a compiler that includes full
tasking by the end of calendar 1982 and an equally complete resident compiler
approximately a year later. We have also gained hands-on familiarity with a
number of the 432 System's operating system features.


2.1. Interesting aspects of Read_Init_Parameters
The structure of Read_Init_Parameters includes a number of typical and interesting
features of Ada tasks, both from the point of view of inter-task communication and intra-task
body structure:

-Inter-task communication. The task includes nested accept statements, both of
which have both in-bound and out-bound parameters. These accept statements
are implemented using simple request/acknowledge protocols.

-Intra-task computation. The task body includes a rich nested loop structure and
one nested block defining local variables whose ranges are determined
dynamically. The loops include the infinite outermost loop of the task, familiar
"for" loops with fixed upper bounds, and indefinite loops from which escape is
based on "exit when" clauses. As we have expected all along, all of these Ada
control structure forms map in a straightforward way to corresponding control
structures at the state machine level and thence to PPL circuits.
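
The mapping from Ada loop forms to state-machine control can be illustrated with a small sketch. The Python below is not the project's transformation output; the state names (INIT, TEST, BODY, CHECK, DONE) and both function names are hypothetical, chosen only to show how a fixed-bound "for" loop and an "exit when" loop each reduce to a few control states plus a data-path counter.

```python
def run_for_loop_fsm(upper_bound):
    """State-machine form of:  for I in 1 .. upper_bound loop Visit(I); end loop;"""
    state = "INIT"
    counter = 0            # stands in for a data-path counter register
    visited = []           # stands in for the effect of the loop body
    states_seen = []
    while state != "DONE":
        states_seen.append(state)
        if state == "INIT":
            counter = 1
            state = "TEST"
        elif state == "TEST":
            state = "BODY" if counter <= upper_bound else "DONE"
        elif state == "BODY":
            visited.append(counter)
            counter += 1
            state = "TEST"
    return visited, states_seen


def run_exit_when_fsm(samples, stop_value):
    """State-machine form of:  loop X := Next; exit when X = stop_value; end loop;

    Assumes stop_value occurs somewhere in samples.
    """
    state = "BODY"
    index = 0
    x = None
    consumed = []
    while state != "DONE":
        if state == "BODY":
            x = samples[index]     # loop body: fetch the next value
            index += 1
            consumed.append(x)
            state = "CHECK"
        elif state == "CHECK":
            state = "DONE" if x == stop_value else "BODY"
    return consumed
```

In both cases the loop test becomes a single state with two outgoing transitions, which is the part a self-timed control unit would realize; the counter and the fetched value belong to the data path.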


The data path of Read_Init_Parameters includes several variables which are represented
in the hardware as registers or counters. One array variable is represented as a RAM to
represent a map from type-of-service encoded at the host level to type-of-service encoded at
the local net level. The size of this RAM, which is never apt to be very large in any case, is
limited to four octets (for a 2 by 2 array) in our demonstration implementation. Most of the
above variables are shared with the other two tasks of the submodule; that is, they are
declared local to the containing package, INM_OUT_Module. However, we perceive no
difficulty in achieving mutually exclusive access.
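
As a rough picture of the four-octet RAM described above, the following Python sketch stores a 2 by 2 table of one-octet codes in a flat array. The class name TosMapRam, the row-major addressing, and the sample contents are all assumptions made for illustration; the report does not state how the array is indexed or what values it holds.

```python
class TosMapRam:
    """A 4-octet RAM holding a 2-by-2 table of type-of-service codes."""

    ROWS = 2
    COLS = 2

    def __init__(self, octets):
        if len(octets) != self.ROWS * self.COLS:
            raise ValueError("expected exactly four octets")
        # bytearray rejects values outside 0..255, so each cell is one octet.
        self._cells = bytearray(octets)

    def read(self, row, col):
        """Return the octet stored at (row, col), using row-major addressing."""
        if not (0 <= row < self.ROWS and 0 <= col < self.COLS):
            raise IndexError("address outside the 2-by-2 array")
        return self._cells[row * self.COLS + col]
```

A lookup such as TosMapRam([0x00, 0x10, 0x08, 0x18]).read(1, 0) reads the third octet of the flat store.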

The one variable that is local to the entire server task does not need to be, and is not,
represented in hardware as a storage element. Variables used locally for loop control are
represented as hardware counters and/or registers, but some sharing is achieved where there
is no chance for conflict.

Although the transformation of the Ada code to the "engine level", i.e., to representation as
a (control unit, data path) pair, has been done by hand, the transformation research reported
in the next section has included consideration of each of the "hand-made" mapping steps in
this particular exercise.
2.2. Arithmetic processing

That we have encountered so little trouble performing the mapping for this task is partially
explained by the fact that the task involves only trivial arithmetic processing. (Indeed, the
entire INM_OUT_Module involves only minor arithmetic processing.) At this stage of our
research we are glad this is the case, as we consider it important to determine first what new
challenges, if any, must be met for achieving asynchronous control.


2.3. Ongoing and future related work

Now that this part of the research is essentially complete, including the development of the
ideas embodied in ASSASSIN, we expect to be concentrating next on such challenges as the
application of the same or related asynchronous design principles to arithmetic processing.
Also included in our agenda is research intended to help us automate the mapping of data
path storage components, identified in the transformation from Ada program units, into PPL
circuits coupled to their controls.
3. A Transformation System: Theory and Implementation

by 

P.A. Subrahmanyam

We have made substantial progress along two directions: implementation of a prototype
transformation system and further development of a conceptual/theoretical basis to support
the design of integrated software/hardware systems. We outline the major contributions
below, with appropriate pointers to references that contain more detailed discussions.


3.1.  Systems  Implementation 

-A set of tools to support experimentation with Ada-to-Silicon transformations has
been implemented, and runs on the TOPS-20. The system has been ported to the
VAX-750, and an initial version has been installed. This porting proved to be a
major job (and problem) due to unstated incompatibilities between INTERLISP-20
and INTERLISP-VAX. Further debugging and testing of the Vax version will be
done when the experimentation is moved completely over to the Vax. (Given the
needed personnel, we expect this to be carried out over the next year, when our
address space requirements force us to move over to the Vax.)

-An  initial  set  of  transformation  routines  has  been  implemented  and  is  being 
augmented  so  as  to  handle  additional  syntactic  constructs  in  Ada.  This  set  of 
programs  is  intended  to  aid  in  the  interactive  generation  of  the  target  hardware 
description  in  a  symbolic  representation.  Details  of  the  current  status  of  this  work 
are  reported  in  [24]. 


3.2.  Conceptual/Theoretical  Basis  for  Transformation 

-A unified theoretical framework to support a broad spectrum of the VLSI design
process has been introduced in [29], which is currently available in the form of the
draft of a research monograph. This monograph introduces an algebraic
framework to aid in the synthesis and verification of special purpose VLSI
systems, proceeding from high level specifications. It allows for abstract
specifications of the syntax, semantics, temporal and performance requirements
particular to a given problem. The characteristics of the environment in which the
system is embedded can also be specified and are used in the synthesis process. In
addition, the framework allows several of the constructs in existing languages to
be modelled, including nondeterminism, concurrency, and data/demand driven
evaluation. This allows the infrastructure to be (1) applied to situations wherein
the problem "specification" is in the form of a program in a conventional high level
language and (2) used to model the lower level synchronous/asynchronous nature
of implementations. Topology and circuit layout geometry can also be expressed
by using the algebraic primitives available.

-Annotations to Ada have been proposed to aid the abstract specification of
temporal properties of systems and desired performance requirements [25, 28, 12].

-Transformation methods to apply the theory in the context of Ada to obtain
systolic implementations are detailed in [27, 24].

-An algebraic modelling of weak conditions to be met by asynchronous circuits has
been done; the resulting model is very simple, and the conditions concise and
intuitive [26].

Following a discussion of the specification and synthesis methods, illustrations are given
in [29] that demonstrate the use of the proposed theoretical basis in synthesizing various
classes of algorithms. It is shown how (families of) systolic algorithms may be obtained as a
special case. Methods for proving the correctness of implementations are presented and
illustrated with examples. The concept of the propagation of computational loci arises
naturally in the course of the development, and serves to generalize the commonly used notion
of a "wavefront" of computation for 2-dimensional architectures. Automatable design aids based
on the proposed algebraic basis are delineated. Finally, it is shown how MOS circuits can be
modelled using the primitives available, and the algebraic derivation of Bryant's simulation
algorithm used in MOSSIM II is illustrated in this context.


3.2.1. Interface With Diana

Most of our transformation tools use the parse tree representation of a program as the
primary data structure they work with. We have in mind the long term objective of being able
to interface with the tools that are designed to operate on Ada program parse trees and that
are being developed by the Ada community at large (and in particular the DARPA community).
To this end, we have been interacting (to a limited extent) with the Diana group (primarily at
Tartan Laboratories).


3.3.  Some  Remarks  on  System  Implementation  Issues 

While we are continuing work on the current version of the transformation system (in
Interlisp, and on the Vax and DEC-20), it has become clear that there are two major
deficiencies that need to be remedied sooner or later. These are (1) unsuitability of the current
parse tree interface (and parser generator) for several of the transformation routines
themselves; and (2) (lack of) speed: this is due to the slowness of Interlisp on the Vax
(compounded, of course, by the fact that we are working with non-trivial pieces of software).

To solve the first problem, it is necessary to redesign the parser generator (which has been
imported from ISI [31]). However, since the other tools (particularly the syntax directed editor
generator and pattern matching system) and the history list mechanism are all very much
inter-related and quite deeply ingrained in the system, there is a substantial software
development effort involved in doing this. Currently, we have neither the equipment nor the
man-power to support such an effort. We envision the redesign being more profitably done
using a newer generation of Lisp (e.g. PSL, CommonLisp) for efficiency reasons, and run on
personal machines, rather than on a Vax-like machine. In the interim, however, the response
of the extant version of our system can also benefit greatly from being run on an
Interlisp-supporting machine, e.g., the Dorado/Dolphin. Having access to such systems would
obviously result in greatly improved programmer productivity.




4.  PPL  Design  Activities 


by 

Kent  F.  Smith,  Brent  Nelson,  Tony  Carter,  and  Alan  Hayes 


The PPL design [...] circuit layout, simulation, electrical checking, and pattern generator
tape preparation. It includes: (1) symbolic layout [...]


4.1.  PPL  Design  Characteristics 
The  characteristics  of  design  using  the  PPL  methodology  include: 

1. IC design is performed by placing small circuit modules, which can be represented
with logic symbols, on a grid representing the integrated circuit. When the grid
[...] representation and the topological [...] circuit. Efficient design changes can
be made as a result of this design methodology because the designer has
simultaneous perception of the circuit function and the circuit topology.

2. The circuit modules have predefined schematic and composite representations.
They are custom designed to optimize performance and size for any specific
integrated circuit process. Design Rule Checking (DRC) is performed on the
module, and thus it is not necessary to do DRC on the overall circuit since it is
simply a collection of circuit modules.

3. [...] designed in PPL and no custom design is required. The interconnect can
also be made by the placement of PPL cells on the grid. All interconnections
between modules are there by default. The designer only places breaks to remove
connections rather than to add them.

4. [...] together to perform specified functions. These macro cells can have custom
physical shapes to conform to specific space requirements.

5. Simulation and checking are easily accomplished, eliminating the need for very
time-consuming operations. The only elements manipulated are [...] rectangles
which must be checked in systems that design at the transistor level.
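
The connect-by-default rule in item 3 can be pictured with a toy model. The Python below treats one grid row as a wire running through every column and lets the designer cut it only by placing breaks. The function name nets_in_row and the convention that break b cuts the wire between columns b and b+1 are assumptions for illustration; they are not taken from the PPL tools.

```python
def nets_in_row(num_columns, breaks):
    """Split one grid row into runs of columns that remain connected.

    breaks is a set of column indices; index b means the row wire is cut
    between column b and column b + 1. With no breaks the whole row is
    one net, reflecting the connect-by-default rule.
    """
    if num_columns <= 0:
        return []
    nets = []
    current = [0]
    for col in range(1, num_columns):
        if (col - 1) in breaks:
            nets.append(current)   # a break ends the current net
            current = [col]
        else:
            current.append(col)    # default: neighbours stay connected
    nets.append(current)
    return nets
```

With six columns and breaks after columns 1 and 3, the row separates into three nets of two columns each; with no breaks it stays a single net.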


4.2. Analogy Between the PPL Design and a Computer Program
There is an analogy between the development of the PPL design methodology and
programming languages. The 1s and 0s which were used in early machine language [...] are
analogous to the rectangles which are used in the custom layout of integrated circuits.
Placing transistors on a composite might be thought of as being analogous
to writing machine language code in hexadecimal, since we are still placing rectangles [...] a
shorthand form. The PPL design methodology is analogous to writing programs in
assembly language [...] collections of transistors [...]. The PPL design methodology is
still very dependent upon the specific technology [...] it is designed [...] assembly
language is machine dependent.


The analogy between the development of computer programs and the PPL methodology can
be carried even further with the compilation of high level circuit description languages to
integrated circuit layouts (silicon compilers). The high level descriptions of the integrated
circuit are machine independent and are compiled directly to a specific PPL cell set designed
in a particular technology. To date there have been cell sets done in NMOS [21], CMOS [22],
and I2L [23]. An example of such a silicon compiler is ASSASSIN [7], which is currently in use
at the University of Utah.


4.3.  Design  Time  vs.  Integrated  Circuit  Area 

The main disadvantage of PPL design methodology is that it will probably result in circuits
which are larger than completely custom-designed circuits. Previous work done by the VLSI
group at the University of Utah has compared some custom designs to some PPL designs. This
gives insight into the tradeoffs which exist between the two techniques. A circuit known as
the Utah Serial Cordic Machine (USCM) was designed under a contract with Wright Patterson
AFB for the VHSIC program [3, 4, 5] using both custom design techniques and the PPL Design
Methodology. The USCM was constructed using an implementation similar to the
shift-register scheme proposed by Volder [30].
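
For readers unfamiliar with CORDIC, the shift-and-add iteration that Volder's scheme is built on can be sketched as follows. This is a floating-point illustration of rotation-mode CORDIC only; it is not the USCM design, which this report does not describe in detail, and the function name cordic_sin_cos and the iteration count are assumptions. In hardware the multiplications by 2^-i are bit shifts, which is what makes a serial shift-register realization attractive.

```python
import math


def cordic_sin_cos(angle, iterations=24):
    """Rotation-mode CORDIC: return (cos(angle), sin(angle)).

    Intended for |angle| <= pi/2, which lies inside the convergence
    range of the basic iteration.
    """
    # Elementary rotation angles atan(2^-i); a hardware unit would hold
    # these in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Each micro-rotation stretches the vector; pre-scale x to compensate.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = 1.0 / gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0      # rotate toward zero residual angle
        shift = 2.0 ** -i                # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * angles[i]
    return x, y
```

Each iteration uses only shifts, additions, and one table lookup, so the results agree with math.cos and math.sin to roughly the precision implied by the iteration count.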

The USCM was implemented using a CMOS PPL cell set. Its design time and chip area
were compared to those for an equivalent custom NMOS design done at Boeing Aerospace
Corp. The entire CMOS PPL chip was designed and simulated in approximately eight man
days, compared to approximately eighty man days for the NMOS custom design. The CMOS
PPL design was 19 percent larger than the custom NMOS design. While these figures may
not be an accurate reflection of the variables which enter into design time measurements, they
are indicators that PPL designs require significantly less design time than do equivalent
custom designs and result in chips which are not significantly larger in area.
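
The arithmetic behind this comparison can be made explicit. The following is only a small sketch using the approximate figures quoted above; the function name and the way the two quantities are reported are illustrative assumptions, not part of the original study.

```python
# Approximate USCM figures quoted in the text (man-days and relative area).
PPL_DAYS = 8          # CMOS PPL design and simulation
CUSTOM_DAYS = 80      # custom NMOS design
AREA_PENALTY = 0.19   # PPL chip is 19 percent larger


def tradeoff(ppl_days, custom_days, area_penalty):
    """Return (design-time reduction factor, PPL area relative to custom)."""
    speedup = custom_days / ppl_days
    relative_area = 1.0 + area_penalty
    return speedup, relative_area


speedup, relative_area = tradeoff(PPL_DAYS, CUSTOM_DAYS, AREA_PENALTY)
print(f"design time reduced by a factor of {speedup:.0f}")
print(f"PPL area is {relative_area:.2f} times the custom area")
```

On these figures, a tenfold reduction in design time is bought with roughly a fifth more silicon area.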

This favorable reduction in design time can be attributed to several factors: (1) The designer
has concurrent perception of logical function and layout. Thus, he can immediately see when
the logic function being implemented does not fit in well with the rest of the circuit. The logic
design is made as the composite is drawn. This eliminates the need for separate composite
layout/logic design stages. (2) The higher level symbolic notation allows the designer to
manipulate very complex logical elements in an efficient manner. It is, for example, not
necessary to trace a complex series of logic gates to determine the function of the circuit
because the symbolic notation is easily read and interpreted. In addition, the symbolic
notation can be directly simulated and does not require the extraction of the transistor-level
circuit from the composite.

Past experience would indicate that the area penalty incurred by the PPL design
methodology will eventually disappear as more sophisticated design tools are developed. This
is again analogous to the development of compilers. It is well known that, as expertise in
compiler writing improved, the gap between hand-coded and compiler-produced object code
size became negligible. Some of the techniques being developed for compaction of integrated
circuit layouts will be used to close the current gap between the area required for custom
designs and automatically generated PPL layouts.


4.4.  The  Utah  PPL  Design  System 

In  addition  to  the  development  of  the  PPL  as  a  hardware  implementation  methodology 
described  above,  the  other  major  thrust  of  research  here  at  Utah  has  been  in  developing 
software  tools  for  PPL  design.  The  goals  of  this  software  research  have  included  the  following: 
(1)  Finding  ways  to  exploit  the  symbolic  nature  and  representation  of  a  PPL  design  to  reduce 
design complexity. (2) Development of CAD tools around conventional computer hardware,
which  would  allow  designers  to  work  from  remote  workstations.  (3)  Creation  of  a  complete 
system  to  be  used  by  the  IC  design  community  here  at  Utah. 

An integral part of the design system is a Computer Vision CADDS2/VLSI Designer
System. It is used to do the composite layout of the individual PPL cells, placement of the
individual cells on a grid to form a circuit, connecting the circuit to pads, adding scribe lanes,
and generating a PG tape. Although we have relied heavily on this machine in the initial
development of the system, in its absence all of the functions it performs could be done with
other tools (the Cal-Tech Software Package for example).

The other part of the design system is built around a DECSystem-20. A silicon compiler for
finite state machines (FSM), a symbolic layout system, a simulator and cell placement
checker, and a compaction program all reside there. The transfer of designs between the
Computer Vision machine (CV) and the DECSystem-20 is done using a mag tape written in
Computer Vision External Database format. The combination of these two computers gives
the system the power of the CV's IC layout features combined with the computing power of a
mainframe.


Each PPL cell used in the system has three representations. The composites of the cells are
designed so that they fit together by virtue of their being placed adjacent to each other on the
grid. A schematic representation of each cell is created for reference. A graphical
representation is also created, which is used by the designer as he uses the cells to form larger
circuits.


4.5.  Presently  Existing  Circuit  Layout  Tools 

The placement of the PPL cells on the grid to form a circuit can be done using either the
Computer Vision machine or one of several programs on the Utah DECSystem-20. The
program used for cell placement on the DECSystem-20 is known as SLED (Structured Logic
Editor) [15]. In SLED, the PPL design is represented as an array of cell symbols which are
then edited. With the SLED editor, a simple CRT terminal and modem is all that is needed for
circuit design, but at the expense of more cryptic graphical representations of the individual
PPL cells than those found on the Computer Vision machine. In general, the ability to use
SLED from a remote terminal outweighs this limitation. Advanced editors are now being
designed to run on a CRT terminal that will overcome some of the graphical limitations of
SLED.

SLED was designed to be similar to a screen-oriented text editor. In fact, the commands in
SLED are the same as the equivalent commands in EMACS [8], a popular screen-oriented text
editor. Cursor movement is possible in any of the four directions, and regions (windows) can
be marked and then named, deleted, replicated, or written to a disk file. Conventional text
editors, however, only allow for scrolling and windowing in the vertical direction (lines longer
than the width of the screen are wrapped around). In SLED, scrolling and windowing are
possible in both directions. Thus, an array with 300 columns and 300 rows can be displayed
and edited using SLED without screen wrap-around. The effect is that the user has an 80x24
window which can be moved around the array.
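
The two-directional windowing described above can be illustrated with a short sketch. It is not SLED's implementation, which is not described here; the class name `Viewport`, its fields, and the scrolling policy (move the window just far enough to keep the cursor visible) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Viewport:
    """An 80x24 window onto a larger array of cell symbols (hypothetical model)."""
    rows: int            # rows in the full array
    cols: int            # columns in the full array
    height: int = 24
    width: int = 80
    top: int = 0         # array row shown on the first screen line
    left: int = 0        # array column shown in the first screen column

    def scroll_to(self, row, col):
        """Move the window just far enough that (row, col) is visible."""
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise IndexError("cursor outside the array")
        if row < self.top:
            self.top = row
        elif row >= self.top + self.height:
            self.top = row - self.height + 1
        if col < self.left:
            self.left = col
        elif col >= self.left + self.width:
            self.left = col - self.width + 1

    def visible(self, grid):
        """Return the part of the array currently on screen, without wrapping."""
        return [line[self.left:self.left + self.width]
                for line in grid[self.top:self.top + self.height]]


grid = [["." for _ in range(300)] for _ in range(300)]
vp = Viewport(rows=300, cols=300)
vp.scroll_to(299, 299)           # cursor moved to the far corner of the array
window = vp.visible(grid)
```

Because the window scrolls horizontally as well as vertically, no line is ever wrapped, whatever the width of the array.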

Circuit layout can also be accomplished using a first-generation silicon compiler.
Compilation of Ada language modules to circuits is accomplished using the program named
ASSASSIN [7]. This program takes as its input a textual description of the operation of a
control unit (Finite State Machine) and from it generates a PPL layout implementing the
control unit.


4.6.  Circuit  Simulation  and  Electrical  Checking 

Simulation of the PPL design is essential before actual fabrication. An important part of
the design system is a simulator (ASYLIM) which can do simulation of the PPL. Because the
PPL cells are simulated and checked individually at the transient level when the cell set is
designed, the complete circuit made up of PPL cells can be simulated at a switch or gate level.
ASYLIM [16, 17] reads the circuit database written in Computer Vision External Database
format. Thus, the actual design can be simulated rather than a logic equivalent.

ASYLIM is similar to other recently developed MOS simulators in that it uses a switch
model. However, the development of a simulator for PPL has shown [17] that a special
purpose simulator was required in order to preserve the user's abstract view of the circuit.
The input format to existing simulators is typically given in the form of a table or listing of
transistors and nodes. To preserve the user's abstract view of the circuit it was necessary to
design a simulator for PPL where the elements in the simulator correspond to those in the
PPL cell set. During the interactive debugging phase of the simulation of a circuit, the user
can then refer to circuit elements by their position in the PPL array. An added feature of the
PPL simulator is that the information stored in the simulator's internal representation of the
circuit interconnect structure can be used for additional circuit checking unique to the PPL
methodology. The end result is that ASYLIM is similar to conventional switch-level
simulators but with an extensive user-interface that allows the user to work with the circuit
at the symbolic PPL level, the same level he uses when designing.

ASYLIM makes use of six-valued logic and uses a unit-delay timing model [1, 2]. The
underlying circuit model primitives are switches but with extensions to allow for the
simulation of certain entities as gates (flip flops and latches). It has been shown that the
unit-delay model is adequate provided the circuit is free from races. Thus it can be used to
model the sequence of circuit activity [2].

An additional advantage of using ASYLIM over other simulators is that it contains an
extensive interactive circuit debugger. The features of this debugger allow the user to view
the circuit interconnect structure as constructed by the simulator. This is displayed in a
readable format that allows the user to quickly compare the simulator's interpretation of the
circuit element interconnections and the intended design. This comparison uncovers most
design errors relatively quickly. In addition, the simulator performs a pre-simulation
plausibility check on the circuit's nodal structure. This feature (the idea borrowed from
Bryant's MOSSIM [2]) enables the user to find a large percentage of the design errors without
ever going to the expense of an actual simulation. This check identifies nodes with fanout but
no inputs, inputs but no fanout, no path to either power or ground, or multiple pullup loads.
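
The four conditions named above lend themselves to a simple static pass over the node list. The sketch below is not ASYLIM's code; the `Node` record and its fields are a hypothetical summary of what a simulator might know about each node after building the interconnect structure.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical per-node summary extracted before simulation."""
    name: str
    drivers: int         # number of elements that can drive this node
    fanout: int          # number of elements that read this node
    pullups: int         # number of pullup loads attached
    reaches_rail: bool   # True if some path leads to power or ground


def plausibility_check(nodes):
    """Return (node name, problem) pairs for the four conditions in the text."""
    problems = []
    for n in nodes:
        if n.fanout > 0 and n.drivers == 0:
            problems.append((n.name, "fanout but no inputs"))
        if n.drivers > 0 and n.fanout == 0:
            problems.append((n.name, "inputs but no fanout"))
        if not n.reaches_rail:
            problems.append((n.name, "no path to power or ground"))
        if n.pullups > 1:
            problems.append((n.name, "multiple pullup loads"))
    return problems


nodes = [
    Node("clk", drivers=0, fanout=3, pullups=0, reaches_rail=True),
    Node("q", drivers=1, fanout=2, pullups=1, reaches_rail=True),
    Node("x", drivers=1, fanout=0, pullups=2, reaches_rail=False),
]
problems = plausibility_check(nodes)
for name, problem in problems:
    print(f"{name}: {problem}")
```

A pass of this kind touches each node once, which is why such errors can be reported without the expense of a simulation run.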

While a logic or switch-level simulation can provide an invaluable service in verifying the
logic design, there are many features of a design that do not show up in a simulation run. For
example, the ground node may be specified as an input to a transistor in a diagram but it
requires an explicit check on the layout to ensure that ground actually has been routed to that
device. In PPL design, these types of electrical (non-logic) entities are included in the design
using special cells. For instance, the power bussing structure is included by placing power and
ground bus cells around the circuit perimeter. In addition, other cells, like row and column
loads, are usually left out of logic diagrams but must be included for the circuit's correct
operation. ASYLIM checks for these cells as a part of its operation.


4.7.  Self  Timed  IC  Design  with  PPLs

Another activity which has been funded by a private company and is of importance in the
development of the PPL methodology is the design of self-timed modules using the PPL cell
set. The work is based on techniques developed earlier [9] for realizing self-timed stored state
sequential circuits. The original investigations were applied to off-the-shelf SSI parts. The
present investigations are for the transfer of those ideas to large collections (macros) of PPL
cells for use in the design of self timed systems to be contained on single integrated circuits.
The investigations have led to further development of the PPL cell set to include methods for
self timed circuits [11].

This research has resulted in a design discipline for self-timed stored state machines which
has been developed using a conventional single rail Four Cycle signalling protocol. (State
descriptions are encoded in PLAs represented in PPL.) The discipline differs from that used by
Carter [7], which uses a technique known as a "one hot" scheme. The approach used for
realizing the self timed stored state machines is based on two key developments: (1) A novel
clocking circuit that generates a non-overlapping two phase clock cycle for an arbitrary size
register, where the duration of the phi 1 phase of the cycle is automatically adjusted to the
register size, and (2) A layout discipline for the folded PLA holding the state table, which
guarantees that the inputs to the state register will be valid at the time that the clock cycle
occurs.
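
The text names the Four Cycle signalling protocol without spelling it out. As a reminder, the sketch below walks through the conventional four-phase (return-to-zero) handshake on a request/acknowledge pair. It is the textbook protocol, offered on the assumption that this is what is meant, and says nothing about how the Utah circuits realize it; the function names are illustrative.

```python
def four_cycle_transfer():
    """Yield (req, ack) after each of the four transitions of one transfer."""
    req, ack = 0, 0
    req = 1              # 1. sender raises request (data is valid)
    yield (req, ack)
    ack = 1              # 2. receiver raises acknowledge (data accepted)
    yield (req, ack)
    req = 0              # 3. sender withdraws request
    yield (req, ack)
    ack = 0              # 4. receiver withdraws acknowledge (idle again)
    yield (req, ack)


def is_four_cycle(trace):
    """Check that a trace of (req, ack) pairs is a whole number of transfers."""
    expected = [(1, 0), (1, 1), (0, 1), (0, 0)]
    return len(trace) % 4 == 0 and all(
        step == expected[i % 4] for i, step in enumerate(trace))


trace = list(four_cycle_transfer()) + list(four_cycle_transfer())
print(trace)
print(is_four_cycle(trace))
```

Only one signal changes at each step, so correct operation depends on the order of events and not on how long each step takes.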

The method depends on certain properties of the NMOS PPL cell set, i.e. that row and clock
wires are polysilicon, and that registers are formed by locating flip-flop cells such that their
clock lines are serially connected. This method offers a designer the advantage that he need
not concern himself with the timing details of a state machine design in order to assure that it
will work. Assuming that the state table realized by the PLA is correct, that the rows and
columns of the design are properly loaded, and that the proper interconnections have been
made (all of which can be verified with the PPL simulator [17]), the designer can be assured of
correct operation of the state machine. The principal disadvantage of the method is the
overhead of the clocking circuit which must be associated with each state machine.

In addition to the self-timed state machine design, the described design discipline [11] has
been applied to several interesting types of self-timed data-path modules, for example
multi-bit latches and ripple-carry counters.


4.8.  Future  CAD  Tools  for  the  PPL  Design  Methodology

Our  operational  design  tools  should  be  enhanced.  The  following  agenda  lists  the  tools  we 
have  identified  as  being  an  important  part  of  a  design  system  for  this  methodology  and  which 
we  plan  to  develop: 

1. A Relational PPL Database Management System: This will allow the same
software tools such as the editor and simulator to be used on PPL designs done
using any specified integrated circuit technology such as NMOS, CMOS, I2L, and
GaAs. In addition, it will provide a standard interface between the various CAD
programs.

2. A Symbolic, Interactive, PPL Editor: This editor will be used to create a
symbolic representation of a PPL circuit. It will be used interactively by a
designer for the semi-automatic placing of PPL cells on the PPL grid. Because of
the symbolic nature of PPL, many of the mundane design tasks can be
automatically performed by the editor, leaving the designer free to concentrate
on logical design. The editor will use either tablet or keyboard entry with
simultaneous graphical representation of both the logic description and the
circuit topology.

3. Minimization of PPL programs: Development of a compaction program for
compressing a PPL design by rearranging its symbolic description. Such a
program will use heuristically driven artificial intelligence techniques to arrive
at a near-optimal solution to the minimization problem. This tool will give us
the capability of doing loosely packed PPL designs which can then be
automatically compressed. This is a unique feature of the PPL design
methodology and can be accomplished because of the symbolic nature of the PPL.

4. Predefined Structured Logic Blocks: We are persuaded that circuits that
already contain large blocks of non-PPL structured logic should be designed
using similar techniques to those presently used for the design of such blocks.
For instance, if a random access memory (RAM) is required in a circuit, it is more
efficient, both from a performance as well as a topological standpoint, to actually
do a custom layout of the RAM. The PPL cell set can be extended to include very
elementary cells from which macro cells can be developed for any specific
implementation of a RAM. Components generated by such an implementation,
although not strictly PPLs, would be compatible with their PPL neighbors. A list
of structures we expect to implement as macros includes:

nxm ram
nxm rom
n-bit ripple adder
n-bit fast adder
n-bit priority encoder
n-bit register
nxm multiplier
n-bit comparator
n-bit synch counter
n-bit ripple counter
n-bit by m:1 MUX
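
The idea of a macro parameterized by word size can be illustrated at the behavioral level. The sketch below builds an n-bit ripple adder out of n identical one-bit cells. It models only the logic, not PPL cells or layout, and the function names are assumptions chosen for the example.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder cell: returns (sum bit, carry out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def ripple_adder(n):
    """Generate an n-bit adder by chaining n full_adder cells through their carries."""
    def add(x, y, carry_in=0):
        carry = carry_in
        result = 0
        for i in range(n):                       # bit 0 is the least significant
            bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
            result |= bit << i
        return result, carry                     # n-bit sum and final carry out
    return add


add4 = ripple_adder(4)
print(add4(3, 4))
print(add4(9, 8))
```

The same generator yields an adder of any width, which is the property that makes such structures natural candidates for macros.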


4.9.  Observations 

Our research thus far has demonstrated the usefulness of the PPL methodology as a higher
level design technique for hardware analogous to the use of assembly language for computer
programming. The analogy has been extended by the introduction of ASSASSIN, a
first-generation silicon compiler for speed-independent finite state machines.

Our design system has proven useful for doing actual design of a variety of integrated
circuits. It has reduced design times required by an order of magnitude. Resultant designs
are easily simulated and corrected due to their symbolic representation. System designers
with little or no direct experience with integrated circuit design can do actual IC layout


6.  Appendix


Ada-to-Silicon Project
University of Utah:

DoD Internet Protocol INM_OUT submodule

Ada code for the body of task Read_Init_Parameters
Version of October 25, 1982


separate (Inm_Out_Module)

task body Read_Init_Parameters is

   -- Accessed globals:
   --    number_of_local_net_types_of_service:      octet_type
   --    local_net_type_of_service_table_row_size:  octet_type
   --    tos_table:                                 octet_buffer_type

   -- Renamed task entry:
   --    The package memory_module containing the task Memory holds
   --    to-be-sent datagrams as well as initialization parameters
   --    needed by INM_OUT.

   procedure Memory_request (
      request_type_formal:      memory_request_type;
         -- Load_address or receive_datum_octet.
      chunk_of_address_formal:  chunk_of_address_type;
         -- Don't care when request_type_formal is
         -- receive_datum_octet.
      octet_formal:             out octet_type)
         -- Don't care when load_address.
      renames Memory.Request;

   -- Local variable declaration:

   -- The following variable is commented out. It appeared only in the
   -- "high-level" version used to read in the TOS table. See below.
   -- number_of_tos_table_octets: integer range 2 .. max_tos_table_size - 1;

   octet_register: octet_type;

begin
   loop
      accept Go(
         init_num_formal:  bit4;              -- For Carter's paper
                                              -- only; otherwise bit3
         response:         out out_response)
      do
         response := sent_ok;                 -- Also means Init_ok.

         -- Get from the server all of the addr_chunks needed to form the base
         -- address in memory that holds the initialization parameters and
         -- send these chunks to the Memory module.
         for index in 1 .. init_num_formal
         loop
            accept Srv_req(                   -- Get next address
                                              -- chunk from the
                                              -- Server Module.
               server_command_datum:  srv_command;
               response_to_server:    out out_response)
            do
               Memory_request(                -- Put chunk out to the
                                              -- Memory module.
                  request_type_formal     => load_address,
                  chunk_of_address_formal =>
                     Convert_srv_command_to_chunk_of_address
                        (server_command_datum),
                  octet_formal            => dont_care_octet);
            end Srv_req;
         end loop;

         -- Get the 6 individual initialization parameters (contained in the
         -- next 8 octets received) from the Memory Module.
         for index in 1 .. 8
         loop
            Memory_request(
               request_type_formal     => receive_datum_octet,
               chunk_of_address_formal => dont_care_X_datum,
               octet_formal            => octet_register);

            case index is
               when 1 => Inm_max_packet.lo   := octet_register;
               when 2 => Inm_max_packet.hi   := octet_register;
               when 3 => Inm_address_length  := octet_register;
               when 4 => Inm_time_out.lo     := octet_register;
               when 5 => Inm_time_out.hi     := octet_register;
               when 6 => ack_type            := octet_register;
               when 7 => local_net_type_of_service_table_row_size
                                             := octet_register;
               when 8 => number_of_local_net_types_of_service
                                             := octet_register;
            end case;
         end loop;


         -- Convert the local net timeout into milliseconds?
         -- time_out_in_milliseconds := Inm_time_out / 1000.0;
         --    Left-hand side variable declared
         --    in Inm_Out_Module. Value is used
         --    later in Do_send procedure.
         --    Note: Davis never did this in
         --    his design. Is this step needed?
         --    No! We don't need this step
         --    since the quotient can be
         --    approximated by a div by 2**10
         --    in the event we need to
         --    represent milliseconds.

         -- Read in type of service translation table.

         -- The following code in comments is replaced below by a
         -- "lower-level" version that closely reflects the hardware
         -- implementation chosen, in which we eliminate the need
         -- for a multiplier.

         -- number_of_tos_table_octets :=
         --      local_net_type_of_service_table_row_size
         --    * number_of_local_net_types_of_service;

         -- -- Check to see if required table size exceeds maximum
         -- if number_of_tos_table_octets > max_tos_table_size then
         --    response := bad_srv_command;
         --    return;
         -- end if;

         -- for index in 1 .. number_of_tos_table_octets
         -- loop
         --    Memory_request(
         --       request_type_formal     => receive_datum_octet,
         --       chunk_of_address_formal => dont_care_X_datum,
         --       octet_formal            => tos_table(index));
         -- end loop;

         declare
            row_number: integer range 0 .. number_of_local_net_types_of_service;
            col_number: integer range 0 ..
                           local_net_type_of_service_table_row_size;
            index:      integer range 0 ..
                             number_of_local_net_types_of_service
                           * local_net_type_of_service_table_row_size
                        := 0;
         begin
            row_number := 0;
            loop                  -- Outer loop reads all rows of TOS table.
               col_number := 0;
               loop               -- Inner loop reads in one row of TOS table.
                  Memory_request(
                     request_type_formal     => receive_datum_octet,
                     chunk_of_address_formal => dont_care_X_datum,
                     octet_formal            => tos_table(index));

                  col_number := col_number + 1;
                  exit when col_number = local_net_type_of_service_table_row_size;

                  index := index + 1;
                  if index > max_tos_table_size then
                     response := bad_srv_command;
                     return;      -- Exit the current accept statement.
                  end if;
               end loop;          -- End inner loop.

               row_number := row_number + 1;
               exit when row_number = number_of_local_net_types_of_service;
            end loop;             -- End outer loop.
         end;                     -- End declare block.

      end Go;                     -- End of init processing.
   end loop;                      -- End of outer-most (infinite) loop.

end Read_Init_Parameters;


Abstract

Many software systems exist for automatically implementing synchronous state-machines.
Presented in this paper is a software system, ASSASSIN, for the design and automatic layout of
self-timed (or speed-independent) control-units as integrated circuit modules. ASSASSIN provides
for the editing of textual descriptions of control-flow, the functional simulation of speed-independent
control-units, and the automatic layout of the implementation as a Path-Programmable Logic (PPL)
program. Assassin uses a well-known technique (a one-hot state encoding) for implementation of
the control-unit. Examples are given illustrating the specification and implementation of simple
state-machines. In addition, the design of a state-machine of interest in the University of Utah's
Ada-to-Silicon project is carried out. A portion of the Ada* code for the "Output Side" of the
Inter-Net-Module (INM_OUT), which will eventually be fabricated as part of the Ada-to-Silicon
Project, is converted by hand to ASSASSIN input format and from there to an integrated circuit
layout by Assassin, thus illustrating the use of Assassin in the context of the Ada-to-Silicon Project.

This work was sponsored in part by the Defense Advanced Research Projects Agency (DARPA)

1.  Introduction

The development of CAD tools for integrated circuit design has exploited a vast body of knowledge
about synchronous computing systems. Old and new integrated circuit technologies have been
well-suited for implementing synchronous computing systems. The success of these synchronous
systems has been prodigious, as witnessed by the recent booms in the manufacturing and purchasing
of computing systems. Current research in semiconductor devices is rapidly heading toward the
ability to construct computing systems which operate orders of magnitude faster and which are far
more complex than those currently available. ASSASSIN treats part of the problem of designing
self-timed systems.

With projected room-temperature speeds of logic devices ranging down to tens of picoseconds of
delay time [3], it appears that the postulate advanced by Seitz in Chapter 7 of Introduction to VLSI
Systems [7] will be borne out. The contention is that the current methods of system synchronization
(global clocks) will result in unreliable circuits as device speeds increase and as device switching
energies decrease.

If Seitz is indeed right, the newer and faster integrated circuit technologies will require computing
systems to be implemented using something like "Self-Timed" or "Speed-Independent" logic. In
these types of logic, only sequence is of concern. The actual gate and wiring delays will not affect the
function, only the absolute speed. It should be noted that any asynchronous device requires the
surrounding environment to be suitably conditioned so as to tolerate the "un-synchronized" actions
of the device.

*Ada is a registered trademark of the U.S. Government, Ada Joint Program Office.


Much work has been done in the implementation of synchronous structures in integrated circuits.
Computing systems can be divided into two main parts: control and data-path. Universities and
industry alike have produced many methods for generating synchronous system control, some using
the PLA. Work has been and is being done in the automatic generation of synchronous data-paths [9].
While there have been some successful efforts to construct self-timed or speed-independent computing
systems such as DDM1 [2] and ILLIAC II [8], there has been very little work done on the
implementation of self-timed computing systems in integrated circuits. This may be because there
were few integrated circuit implementation strategies which readily lent themselves to the
construction [...]

[...] of Path-Programmable Logic [1] (PPL), a derivative of the Storage/Logic Array
(SLA) [10], has proven to be of great value in the generation of self-timed control in integrated
[...]

Assassin is part of a research effort, being pursued at the University of Utah, to convert Ada
programs into integrated circuit implementations. ASSASSIN transforms the control portions of Ada
programs into their corresponding integrated circuit counterparts. In addition, ASSASSIN
provides a software tool for the specification, simulation and compilation of self-timed control-units
to integrated circuit module layouts. As such, it begins to treat some of the low-level problems of
self-timed systems design. It uses PPL as the integrated circuit implementation strategy and a
one-hot encoding of the control states [4] as a mapping from the specification to the circuit
implementation. It allows an implementation independent specification of control (that is,
independent of fabrication technologies and circuit implementation techniques), and provides
functional simulation capabilities. Layout generation (analogous to the software compiler code
generation) results in self-timed circuits which functionally match the results of simulation.
ASSASSIN also provides a single, convenient user interface for all of its functions.


2.  The  Specification  of  Control:  Syntax

The specification of control for a given circuit can result in a labelled, directed graph similar to the
one in figure 2-1. There are named nodes which are called states and labelled directed arcs which
are called transitions. Associated with states are operations on output variables. These operations
may be functions of only the state, or they may be functions of the state and a boolean function of a
set of input variables. Transitions are labelled with a boolean function of members of the set of input
variables which dictates the condition upon which that transition will take place. Transitions may
also have associated operations on outputs (Mealy Machines).
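
The graph model just described maps naturally onto a small data structure. The sketch below is only an illustration of that model, not ASSASSIN's input format or internal representation; the class names, the `step` function, and the example machine are all assumptions. For brevity, state outputs here depend on the state alone.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Inputs = Dict[str, bool]


@dataclass
class Transition:
    target: str                                        # name of the destination state
    condition: Callable[[Inputs], bool]                # boolean function of the inputs
    outputs: List[str] = field(default_factory=list)   # Mealy-style outputs on the arc


@dataclass
class State:
    name: str
    outputs: List[str] = field(default_factory=list)   # outputs tied to the state
    transitions: List[Transition] = field(default_factory=list)


def step(states, current, inputs):
    """Return (next state name, outputs asserted) for one evaluation of the graph."""
    state = states[current]
    asserted = list(state.outputs)
    for t in state.transitions:
        if t.condition(inputs):
            return t.target, asserted + t.outputs
    return current, asserted                           # no arc enabled: stay put


states = {
    "IDLE": State("IDLE", transitions=[
        Transition("BUSY", lambda i: i["start"], outputs=["load"])]),
    "BUSY": State("BUSY", outputs=["run"], transitions=[
        Transition("IDLE", lambda i: i["done"])]),
}
print(step(states, "IDLE", {"start": True, "done": False}))
print(step(states, "BUSY", {"start": False, "done": False}))
```

Interconnection between separate machines would follow from shared input and output variable names, as the text goes on to describe.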

The ability to specify strictly sequential control is certainly essential. Although our current
understanding of concurrent processing is very limited, the ability to handle concurrent paths of
control may also prove to be useful as our understanding increases. Concurrency (in the context of
control) can be interpreted in two ways. The first is where two separate machines operate
independently, communicating via some signalling protocol. The second is where a single machine
performs some types of concurrent processing by having concurrently executing control paths. The
first is handled by having control-units composed of multiple state-machines. In terms of graphs,
this implies that one can draw many separate graphs, whose interconnection is implied by output and
input variable names. The second is handled by allowing, within a single state-machine, some notion
of forking to begin concurrently executing control paths and a notion of joining to terminate
concurrently executing control paths. The addition of the concepts of FORK and JOIN to the graph
model of control-flow is illustrated in figure 2-2.
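
The effect of FORK and JOIN on the graph model can be pictured as tokens moving between states. The
following is a minimal Python sketch of that idea, not ASSASSIN's own representation: the function
name `fire` and the use of a set of active states are assumptions made for illustration.

```python
def fire(active, sources, targets):
    """Move tokens from every state in `sources` to every state in `targets`.

    MOVE: one source, one target.  FORK: one source, many targets.
    JOIN: many sources, one target.  The active set is returned unchanged
    when some source state holds no token.
    """
    sources, targets = set(sources), set(targets)
    if not sources <= active:
        return active
    return (active - sources) | targets


start = frozenset({"A"})

# FORK: a single token in A becomes tokens in B and C.
after_fork = fire(start, {"A"}, {"B", "C"})

# JOIN needs a token in every source state; with only B active nothing happens.
blocked = fire(frozenset({"B"}), {"B", "C"}, {"F"})

# JOIN: tokens in B and C recombine into one token in F.
after_join = fire(after_fork, {"B", "C"}, {"F"})
```

A fully encoded state register can name only one state at a time, which is why a model with several
simultaneously active states is a natural fit for a decoded (one bit per state) representation.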

Output generation from a control-unit can be either enduring or ephemeral. Enduring outputs
are latched and operated on by SET and RESET only. When an enduring output is SET it will
remain on until a RESET operation is performed. Ephemeral outputs are gated and remain on only
while the required condition is met (either residence in a state or execution of a transition). They
are operated on by HOLD.

Figure 2-1: A Simple Control-Flow Graph
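
The difference between enduring and ephemeral outputs can be shown with a small Python sketch. The
class `Outputs` and its step-based evaluation are hypothetical and serve only to illustrate the
SET, RESET and HOLD behaviour described above; they are not part of ASSASSIN.

```python
class Outputs:
    """Toy model of control-unit outputs; class and method names are illustrative."""

    def __init__(self):
        self.latched = {}    # enduring outputs: name -> last SET/RESET value
        self.held = set()    # ephemeral outputs HELD during the current step

    def begin_step(self):
        # Ephemeral outputs drop as soon as their condition is no longer asserted.
        self.held = set()

    def set(self, name):
        self.latched[name] = True

    def reset(self, name):
        self.latched[name] = False

    def hold(self, name):
        self.held.add(name)

    def value(self, name):
        return name in self.held or self.latched.get(name, False)


out = Outputs()

out.begin_step()          # step 1: the controlling condition is met
out.hold("O1")            # ephemeral output
out.set("O3")             # enduring output
step1 = (out.value("O1"), out.value("O3"))

out.begin_step()          # step 2: the condition has gone away
step2 = (out.value("O1"), out.value("O3"))

out.begin_step()          # step 3: an explicit RESET clears the latch
out.reset("O3")
step3 = (out.value("O1"), out.value("O3"))
```

In this sketch O1 falls as soon as it stops being held, while O3 stays on until it is RESET.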

Figure 2-3 contains a control-flow graph which contains all of the features included in the
discussion above. States are represented by rectangles with the name of the state indicated in the
upper left corner, followed by a colon. Output generation is indicated by a right-arrow. To the left
of the right-arrow will be a boolean expression and to the right the operations to be performed and
the names of the outputs which are to be operated on. For example, State B contains three output
operations. The first is unconditional (it depends only on the state of the machine) and causes the
ephemeral output "O1" to be held true. The second is conditional (the boolean expression is "I3") and
causes the enduring output "O3" to be SET. The third is also conditional (the boolean expression is
"I4 OR I5") and causes the ephemeral outputs "O2" and "O5" to be held true and the enduring output
"O4" to be RESET.

Also required in the specification of control is the concept of an initial state. In the graphs, this
is indicated by the arc labelled MasterReset which has no state node at its tail.

In summary, the specification language for control should include the following features:

- the concept of an initial state,
- simple transitions from one state to another (MOVE),
- transitions from one state to many states (FORK),
- transitions from many states to one state (JOIN),
- outputs controlled only by residence in a state or by the execution of a transition,
- outputs controlled by a boolean combination of inputs AND by residence in a state or by
  the execution of a transition,
- arbitrarily complex boolean expressions for conditions (controlling transitions and output
  generation),
- lambda transitions (where the condition is the tautology TRUE),
- ephemeral outputs,
- enduring outputs,
- multiple and varied transitions from a given state,
- multiple and varied transitions to a given state, and
- multiple state-machine control-units.

Figure 2-3: A Complex Control-Flow Graph

The task now is to codify the points listed above, such as in a grammar in BNF. It must allow for
all the points listed above while limiting its expressive power to those points. The language must
be easily parsed and it is desirable that parser generators be used to generate the code for the
parser. Above all, the language should be concise and intelligible to design engineers.

The complete BNF for the language (which is called CUDL) is included in Appendix I. The language
has the ability to represent each of the points listed above. There are four types of blocks in
the language. The first is the CONTROLUNIT block. This block indicates the name of the overall
control-unit and contains STATEMACHINE blocks. It also includes the specification of "global" input
expressions, which assign boolean expressions to an internal variable and can significantly reduce
the size of the code written to describe the control-unit. The names of "global" inputs can be used
in the descriptions of transitions and output generation. Figure 2-4 contains the CUDL code
describing the machine whose graph is in figure 2-3.

controlunit CompileTest5:

  inputs: BIG is I1 and (I2 or not I3);

  selftimed statemachine Test5:

    startstate A:
      forkon BIG to B,C;
      moveon NOT BIG to D;
      hold O1,O2;
      reset O3;
      set O4;
    end;

    state B:
      joins C on I4 AND I5 to F;
      joins E on I4 OR I5 to F;
      hold O1;
      if I3 then set O3;
      if I4 OR I5 then begin reset O4; hold O2,O5; end;
    end;

    state C:
      moveon NOT I6 to E;
      joins B on I6 to F doing begin reset O3;
        if BIG then set O4; end;
      hold O1;
    end;

    state D:
      moveon I7 to F doing set O3;
    end;

    state E:
      joins B on TRUE to F;
    end;

    state F:
      moveon I8 to A;
      moveon NOT I8 to D;
    end;
  end;
end.

Figure 2-4: CUDL Code for the Graph in Figure 2-3

Eventually, given an appropriate display device, a graphical version of this could be developed and
the specification of control could be done in terms of graphs rather than a textual description of
the graph. A project is underway to implement such a front end to ASSASSIN on an Apollo DOMAIN
computer.


3. The Simulation of Control: Semantics

Given that the syntax of control-unit specification is defined, the designer must also understand
the semantics in order to use the system. The semantics of control is directly influenced by the
implementation strategy selected. Since the specification of control should allow for concurrency
within a given state-machine, a scheme which allows the implementation of such concurrency must
be selected. The notion of concurrency eliminates the possibility of completely and uniquely
encoding the state variables. The one-hot implementation scheme (completely decoded) allows for easy
implementation of concurrency. The following discussion is largely based on the assumption that a
one-hot implementation is used.

The specification syntax described in the previous section can be interpreted in three ways. The
interpretation depends on the particular mapping strategy being used in the compilation. The three
possible types of mapping are synchronous, asynchronous, and self-timed. To allow all
interpretations to be eventually simulated and compiled, the language includes the concept of a
state-machine type. The choice of a state-machine level semantic interpretation is made explicit
through the use of the keywords SELFTIMED, ASYNCHRONOUS, and SYNCHRONOUS. In this way, the
user can specify various types of control using the same system. Only the SELFTIMED option is [...].

The simulation of self-timed control can be functional in nature. This functional simulation
provides knowledge about the sequential function of the circuit. Since the self-timed circuit is
such that if sequence is correct, function is correct, the user is sure that the circuit will work
if the environment in which he places it is conditioned to interact in a self-timed manner with
the control-unit.

The simulation of synchronous and asynchronous control really requires the use of a detailed
simulator. This simulator must be able to make accurate delay calculations [...] gate delays. In the
world of the integrated circuit, these delays may or may not be easily calculated, since long wires
and heavy loads will significantly alter the [...]. Thus, the problem of simulation for these types
of systems is much more difficult than for the self-timed [...].

To interpret the semantic actions of the control-unit, one must know first the actions to be taken
to execute a transition and second how outputs are generated. Transitions are [...] change the
internal state of the machine. Although there may be many transitions leaving a given state, it
should never be possible to execute two transitions concurrently from the same state. Since the
control-unit has no control over the sequence of arrival and the timing of the inputs that trigger
transitions, the problem of having two transitions executed simultaneously is inherently a dynamic
one, and its avoidance requires a detailed knowledge of the environment into which the control-unit
is to be placed. If two transitions were executed simultaneously, the result would be a
state-machine which would be in two sequential and mutually exclusive states at the same time.
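
Although the problem is a dynamic one, a designer can at least check statically whether the
conditions on the transitions leaving a state overlap. The sketch below is an illustration in
Python, not a feature of ASSASSIN: the function `conflicting_inputs` and the way conditions are
written as Python functions are assumptions. It enumerates every input assignment and reports those
that enable more than one transition. It says nothing about input timing, which still depends on
the environment.

```python
from itertools import product


def conflicting_inputs(conditions, input_names):
    """Return (assignment, enabled transitions) pairs where several conditions hold."""
    conflicts = []
    for values in product([False, True], repeat=len(input_names)):
        env = dict(zip(input_names, values))
        enabled = [label for label, cond in conditions.items() if cond(env)]
        if len(enabled) > 1:
            conflicts.append((env, enabled))
    return conflicts


# Conditions of the form "I8" and "NOT I8" can never be true together.
exclusive = {
    "to A": lambda e: e["I8"],
    "to D": lambda e: not e["I8"],
}
no_conflicts = conflicting_inputs(exclusive, ["I8"])

# Hypothetical pair of conditions that overlap when both inputs are true.
overlapping = {
    "first": lambda e: e["I4"] and e["I5"],
    "second": lambda e: e["I4"] or e["I5"],
}
found = conflicting_inputs(overlapping, ["I4", "I5"])
```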

The three interpretations of control have somewhat different views of [...]. The self-timed
implementation uses transitions that are essentially handshakes between logically adjacent states.
This characteristic can be portrayed by a "token-passing machine", with controlled splitting and
recombination of tokens (FORK and JOIN). In a transition between state A and state B, state A will
first set state B and then state B will reset state A. Consider the case (figure 3-1) where a
machine contains four sequential states A, B, C and D. [...] state C to state D can occur
immediately after state C is set [...] transition from [...]

Figure 3-1: Handshaking States

Looking from inside the control-unit, there are two types of outputs. The first is [...] the
appropriate condition is met. The second is [...] level is maintained even after the appropriate
condition has disappeared. It is possible, however, to place a latched output in a metastable
condition by trying to set and reset it at the same time. Care must be taken in working with latched
outputs.

Output generation from a control-unit is always conditional upon something. What we term an
unconditional output is an output that depends only on being in a particular state or only on a
particular transition being executed. What we term a conditional output depends not only on the
state or transition, but also on a boolean combination of input variables. [...] immediately upon
entry into a state or upon the execution of a transition. [...] unconditionally operated on from a
state or transition must be released when the state is left or the transition is completed.
Conditional outputs are operated on when the entire condition becomes true, including entry to a
[...]

Because of the handshake going on between logically adjacent states, there is a small amount of
time when the machine is legally in both states at the same time. This allows for ephemeral outputs
to be used in a glitch-free manner between logically adjacent states. Enduring outputs controlled
by logically adjacent states pose a problem if both a set and reset are attempted at the same time
[...]

In ASSASSIN, there is no implicit communication between any two state-machines specified as
part of the same control-unit. All such inter-state-machine communication is accomplished by
explicit signalling protocols using inputs to and outputs from the state-machines.


4.  The  Implementation  of  Control 

The actual physical implementation of control depends on two factors: the circuit implementation
technique and the control-unit implementation technique. The circuit implementation technique
should be picked so as to make the physical realization of the control-unit implementation technique
as simple as possible.

The selection of a control-unit implementation technique depends on the set of features to be
implemented. Thus employing FORK and JOIN prohibits using a monolithic, completely encoded
control-unit. Including FORK and JOIN in a control-unit implementation technique requires either a
very complex strategy for splitting out the concurrent sections of the control into physically (and
perhaps logically) separate sections, a partially encoded scheme where the sequential control
sections are encoded and the concurrent are not, or a completely decoded machine. The one-hot
implementation is a completely decoded scheme in which FORK and JOIN are easily included. The
tradeoffs involved in selecting the one-hot strategy are discussed by Hollaar [4].

Basically, the one-hot strategy involves the use of one latch for each state, two gates for each
transition, a latch or driver for each output, and one gate for each condition controlling
conditional outputs from a given state or transition. For complex machines, the automatic
full-custom layout of a one-hot control-unit could be very difficult.
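
The component counts above amount to simple arithmetic, sketched here in Python. The function name
`one_hot_cost` is hypothetical, and the example figures are one reading of the machine in figure
2-4 (six states, five outputs, three output conditions, and eight transitions if each FORK or JOIN
arc is counted once); that way of counting FORK and JOIN is an assumption, not something the text
states.

```python
def one_hot_cost(states, transitions, outputs, output_conditions):
    """Component estimate for a one-hot control-unit, following the rule of thumb above."""
    return {
        "state_latches": states,                 # one latch per state
        "transition_gates": 2 * transitions,     # two gates per transition
        "output_latches_or_drivers": outputs,    # one latch or driver per output
        "condition_gates": output_conditions,    # one gate per output condition
    }


# Illustrative counts only; see the assumptions in the paragraph above.
cost = one_hot_cost(states=6, transitions=8, outputs=5, output_conditions=3)
total = sum(cost.values())
```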

Path-Programmable Logic provides a very regular structure that is particularly well suited for
implementing one-hot control-units. In the mapping of control onto PPL using a one-hot encoding, a
single latch is used for each state variable. Each transition maps to two PPL row segments, one to
set the next state and the other to reset the current state once the next state has been set. In
addition, complex boolean conditions on transitions (or on outputs) may require the introduction of
temporary gates. In PPL, the AND of several inputs is detected on a single row. The OR is formed
on the columns. For this reason, extra PPL columns containing temporary variables must be
inserted for forming the OR terms of boolean expressions. Outputs are controlled by using a single
PPL row to drive all the unconditional outputs controlled by a state or a transition. Each separate
condition for controlling conditional outputs uses a single PPL row.


4.1.  The  Implementation  of  Control:  Floor  Plan 
With the basic mapping strategy defined above, we soon see that there are many ways to specify
the global organization or floor plan of the control-unit. The one selected for use in ASSASSIN was
chosen because it appears to be simple. This floor plan (see figure 4-1) has the state latches,
temporary variable inverters, and input inverters in a single band across the middle of the
control-unit. Output latches and inverters are placed in a band across the top of the control-unit.
Inputs arrive from the bottom of the control-unit and outputs are emitted from the top of the
control-unit. This stacking of inputs and outputs results in a significantly smaller area and is a
direct consequence of using a PPL-like structure for the circuit implementation. State transitions
are generated in the bottom half of the control-unit and boolean expressions and outputs are
generated between the state latch band and the output band. It is possible to make other area
optimizations in the PPL layout of one-hot control-units.


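    +--------------------------------------+
    |      Output Latches and Gates        |
    +--------------------------------------+
    |       Boolean Expressions and        |
    |          Output Generation           |
    +--------------------------------------+
    |   State Latches, Input/Temp Gates    |
    +--------------------------------------+
    |             Transitions              |
    +--------------------------------------+

Figure 4-1: Global Organization of ASSASSIN Output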

This global organization results in a simple PPL generator that needs no routing tools for
constructing the control-unit. All the PPL generator has to know is which cells to place and where
to place them, an easy problem when compared with routing.

4.2.  The  Implementation  of  Control:  Code  Generation 

We have now almost fully specified the entire system. All that remains is to actually construct
algorithms for generating PPL programs that implement the control-unit. The self-timed control-unit
requires the use of latches for representing states. These latches must indicate their change in
state after the set or reset signal has arrived. The PPL cell designed for this purpose is the
four-wire latch. It contains cross-coupled NMOS inverters for the latch with inverting-buffered
outputs. Thus, this cell cannot signal its change in state until after the latch has changed state.
ASSASSIN can currently generate either a CIF description of the control-unit or a file written in
Computervision's CADDS2 External Data Base format.

The transitions for a self-timed control-unit require two row segments. The first senses that the
machine is in a certain state (say state A), that all possible predecessor states (states which
could have caused a transition to state A) have been reset, and that the condition for the
transition is met. If all these conditions are met, the latch for the next state is set. If there
are outputs controlled by the transition, an inverter is used to appropriately control output
generation from the transition. The second row segment detects that the next state has been
successfully set and resets state A.
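
The behaviour of the two row segments can be sketched as a small Python simulation. This is a
simplified model written for illustration: the function `step`, the dictionary of latch values,
and the idea of evaluating both rows once per step are assumptions, and real circuit timing is not
modelled.

```python
def step(latches, src, dst, predecessors, condition):
    """Evaluate the forward and reverse row segments once for the transition src -> dst."""
    # Forward row: src is active, every predecessor of src is reset, condition is met.
    forward = latches[src] and condition and not any(latches[p] for p in predecessors)
    # Reverse row: the next state has been set, so the current state is reset.
    reverse = latches[dst]
    new = dict(latches)
    if forward:
        new[dst] = True
    if reverse:
        new[src] = False
    return new


# The machine is in state B, having come from state A, which has already been reset.
s0 = {"A": False, "B": True, "C": False}
s1 = step(s0, "B", "C", ["A"], condition=True)    # forward row sets C
s2 = step(s1, "B", "C", ["A"], condition=True)    # reverse row resets B

# While the predecessor A is still set, the forward row cannot fire.
waiting = {"A": True, "B": True, "C": False}
blocked = step(waiting, "B", "C", ["A"], condition=True)
```

After the first step both B and C are set, which corresponds to the short interval in which the
machine is legally in two logically adjacent states.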

Figure 4-2 illustrates a simple transition between two states. The machine is in state B, having
come from state A. State A has been reset. The first row below the state latches performs the
"forward" transition, or setting of the next state. The '0' under the latch for state A detects that
state A has been reset. The '1' under the latch for state B detects that state B has been set. The
'1' under the inverter for input I1 detects that the input condition has been met and the 'S' under
the latch for state C will set state C when the transition occurs. The second row performs the
"reverse" transition, or the resetting of the previous state. The '1' under the latch for state C
detects that state C has been set [...] completed. Completing the operations of both these rows
constitutes a complete transition.

Figure 4-2: A Simple Self-Timed Transition

Asynchronous transitions are different from self-timed transitions in that they do not sense that
[...]. Figure 4-3 shows the same section of control as in figure 4-2, implemented asynchronously.

Figure 4-3: A Simple Asynchronous Transition


Synchronous transitions are implemented the same as asynchronous transitions, with the exception
that the state latches are replaced by clocked flip-flops. This is illustrated in figure 4-4.

Figure 4-4: A Simple Synchronous Transition

The following discussion explains the ASSASSIN compilation of all the constructs described by [...]
control-unit whose flow-graph is contained in figure 2-3. The CUDL code for this control-unit is in
figure 2-4. The complete PPL program for this example [...]. The figures illustrating the
constructs being discussed contain portions of this PPL program. Row segments are referred to from
left to right in a given row. Row and column numbers are as labeled in the figures.

Figure 4-6 illustrates the compilation of a MOVE transition (from state A to state D). Rows 17
through 19 contain the state latches, input gates and temporary gates. T1 contains "I2 or not I3."
T2 contains "I4 or I5." T3 indicates that the JOIN transition from states B and C to state F is
currently being taken. T4 indicates that the MOVE transition from state D to state F is being taken.
Row 15 is the forward transition from state A to state D. It senses that state A is active by the
'1' in column 1, that "BIG" is false by the '0' in columns 2 and 3, and that state F is inactive by
the '0' in column 22. State D is made active by the 'S' in column 17 and the row load is the 'P' in
column 11. The reverse transition in row 14 simply senses with the '1' in column 17 that state D is
active and resets state A with the 'R' in column 0.

Scale-of-two loops pose a particular problem. It is possible to get stuck in both states, with no
way to get out. Scale-of-two loops therefore require some sort of mutual exclusion on transitions to
avoid this problem. Figure 4-7 illustrates the compilation of a scale-of-two loop. Row 5 contains
the forward transition from state D to state F. Note the '0's in columns 0 and 22 which detect the
predecessors to state D. The '+' in column 18 is used in generating the outputs associated with this
transition by driving T4 when the transition is in progress. The right segment on row 12 resets
state D after the forward transition to state F has been finished. Note the '1' in column 19 which
senses that input I8 has not yet become false. This gives the required mutual exclusion of input
signals in a scale-of-two loop. Row 4 contains the forward transition from state F to state D. The
'0' in column 19 detects the false state of input I8 and the other '0's detect the inactivity of the
possible predecessors to state F. Row 4 contains the reverse transition associated with the
transition from state F to state D. The '0' in column 15 senses that input I7 is currently false.

Figure 4-8 illustrates the FORK transition from state A to states B and C. Row 13 contains the
forward FORK transition. It senses that state A is active, that state F is inactive and that input
BIG is true (the '1's in columns 2 and 3). It also sets both states B and C. The reverse FORK
transition is in the left segment of row 12. It detects that both states B and C have been activated
and resets state A.

Figure 4-9 shows the JOIN transition from states B and C to state F. Row 9 implements the
forward transition by sensing that the predecessor state (A) is inactive, states B and C are active,
inputs I4, I5 and I6 are true, and by setting state F. The '+' in column 14 is used for generating
the outputs associated with the JOIN transition from state C. The reverse transition is implemented
in row 8 where the activation of state F is detected and states B and C are deactivated (reset).

Figure 4-10 shows the compilation of the input boolean expression BIG = I1 and (I2 or not I3). The
leftmost row segments on rows 20 and 21 (|+ -1-P| and |+ -P-0| respectively) compile the
subexpression "I2 or not I3." The '+'s in column 3 generate the OR of these two rows into T1. I2 is
sensed by the '1' in column 4 of row 20 and "not I3" is sensed by the '0' in column 5 of row 21. To
sense "BIG", the program must contain '1's in both columns 2 and 3. To sense "not BIG" it must
contain '0's in both columns 2 and 3.
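
The row and column roles described above (AND along a row, OR down a column into a temporary
variable) can be mimicked in a few lines of Python. This is only a logical sketch: the helper `row`
and the dictionary of signal values are assumptions and do not model PPL cells or column numbers.

```python
from itertools import product


def row(env, ones=(), zeros=()):
    """A PPL-style row: true when every '1' signal is true and every '0' signal is false."""
    return all(env[n] for n in ones) and not any(env[n] for n in zeros)


def big(i1, i2, i3):
    env = {"I1": i1, "I2": i2, "I3": i3}
    # Two rows drive the temporary column T1, which forms their OR.
    env["T1"] = row(env, ones=["I2"]) or row(env, zeros=["I3"])
    # A later row senses BIG as the AND of I1 and T1.
    return row(env, ones=["I1", "T1"])


# Compare against the expression written directly, for all eight input combinations.
mismatches = [
    (i1, i2, i3)
    for i1, i2, i3 in product([False, True], repeat=3)
    if big(i1, i2, i3) != (i1 and (i2 or not i3))
]
```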

Figure 4-11 shows both conditional and unconditional output generation from states and
transitions. Row 22 implements the unconditional outputs controlled by state A. The '1' in column 1
senses that state A is active. The '+'s in columns 6 and 13 implement the "HOLD O1,O2;"
statement, the 'S' in column 17 implements the "RESET O3" statement and the 'R' in column 10
implements the "SET O4" statement. The 'S' is used to reset a LATCH2 PPL cell and the 'R' is used
to set it. Rows 24 and 25 implement the conditional outputs controlled by state B. Row 24 detects
the "I4 or I5" condition and HOLDs O5 and O2 and resets O4. Row 25 detects the "I3" condition and
sets O3. The last row segment on row 20 (|1-P---S|) implements the unconditional output (O3)
controlled by the JOIN transition from states B and C to F. Row 26 implements the "if BIG then set
O4" statement from the JOIN transition in state C.

Figure 4-6: Compilation of the MOVE Transition

Figure 4-7: Compilation of the Scale-of-Two Loop

Figure 4-8: Compilation of the FORK Transition

Figure 4-9: Compilation of the JOIN Transition

Figure 4-10: Compilation of Boolean Expressions - BIG

Figure 4-11: Compilation of Outputs


5. The Assassination of a Control Unit

This section illustrates the complete design of a non-trivial state-machine. The control-unit to be
designed comes from the Ada-to-Silicon Project underway at the University of Utah. This project
has as one of its objectives the automatic transformation of Ada programs into hardware
implementations using integrated circuits [5]. The Ada-to-Silicon project is using the Internet
Protocol (see [...]) [...] into three communicating hardware (and software) submodules [6]. Figure
5-1 illustrates this division. The protocol consists of N INU_IN submodules, each of which receives
transmitted data and assembles datagrams from a single local area network, N INU_OUT submodules,
each of which appropriately fragments and transmits datagrams to a single local area network, and a
single INU_SRV submodule that interfaces the N INU_OUT and N INU_IN submodules to one or more host
computers. The complete Ada code describing the INU_OUT submodule has been written and compiled and
will be presented in a forthcoming report.


Figure  5-1:  Internet  Protocol  Hardware  Submodules 

The INU_OUT submodule of the Internet Protocol has been selected as the initial test case.
Preliminary Ada code in the form of a complete task has been written and compiled. INU_OUT
consists of three separate tasks: Main, Read_Init_Parameters and Translate_TOS_Table. Of
these, the hardware architectural design has been completed for the Read_Init_Parameters task.
Read_Init_Parameters deals with the initialization parameters of INU_OUT and loads various
registers with data related to the transmission of datagrams through a local area network.
Illustrated in figure 5-3 is a block diagram of the hardware implementation of this task. Professor
Al Davis performed the mapping of the initial version of Ada code into a block diagram. Several
modifications have been made since that time. The block marked "Read_Init_Pars-FSM" is the
control-unit derived from the Ada code for the Read_Init_Parameters task. Figure 5-2 contains
the Ada code for a section of Read_Init_Parameters. The complete code is found in an appendix.

Figure  5-4  contains  the  control  flow -graph  for  the  Read.  Init.  Parameters  task  as  extracted  from 
the  A  da  program.  It  should  be  noted  that  this  particxilar  flow -graph  does  not  use  the  FORK  and  JOIN 
transitions  available  in  CUDL.  Indeed.  FORK  and  JOIN  will  probably  not  be  used  in  implementing 
tasking,  but  may  be  used  for  more  fine  grained  parallelism  based  on  data  independency.  Ada  aooept 
statements  are  translated  into_  requt  it-acknowledge  handshakes  with  the  appropriate  module. 
These  are  indicated  by  the  name  of  the  accept  (GO  or  SRV)  concatenated  with  ".REQ"  and  "ACK". 
State  RIPO  is  the  initial  state  of  the  machine  and  sends  initializaticjn  signals  to  several  of  the 
datapath  modules  in  the  environment  of  Read.  Init.  Parameters.  Of  particular  interest,  the  signal 
INITNUM  .REG.LOD  is  held  during  this  state.  This  signal  indicates  to  the  register  holding  the 
initialization  number  to  watch  the  associated  three-wire  bus  and  assume  its  value  at  all  times. 
W  hen  this  signal  is  dropped  (in  state  RIPl).  this  register  latches  the  value  on  the  bus.  The  firet 
aooept  statement  ("accept  G0{  ...  )  ")  is  begun  with  the  transition  from  state  RIPO  to  state  RIPl. 
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Figure  5-2:  ADA  Code  for  Rea<l_  Init-  Pararaeters 

Note that the condition for the transition includes, in addition to GO.REQ, INITNUM.REG.DON and
INITNUM.CTR.DON. The machine cannot proceed until it is sure that the initialization number
register contains the correct value and the associated counter has been reset. In state RIP1, the
machine begins the second accept loop. When the SRV.REQ signal arrives, a transition is made to
state RIP1A, where the counter is incremented (indicating that another byte of address is to be
transmitted to the memory module), and a request-acknowledge handshake is performed between
READ_INIT_PARS and the memory module. The signal MEM.SEND indicates to the memory
that it is to receive data. When the counter has been incremented (INITNUM.CTR.DON) and an
acknowledgement from the memory (MEM.ACK) has been received, a transition is made to state
RIP2. State RIP2 terminates the handshake with the INM_SRV module by asserting the signal
SRV.ACK. Once both SRV.REQ and MEM.ACK have been lowered, the output of the comparator
between the initialization counter and register is examined. One of the two transitions from state
RIP2 is executed based on the value of INITNUM.CMP.EQ. If INITNUM.CMP.EQ is on, the
initialization loop is terminated. If it is off, the initialization loop is continued.

Figure 5-4: Control Flow-Graph for Read_Init_Parameters
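
The handshakes in this walkthrough appear to follow a four-phase pattern: the request is raised, the acknowledge is raised, the request is lowered, and the acknowledge is lowered. The following Python sketch checks a recorded trace against that pattern. It is an illustration only and not part of ASSASSIN; the function name `is_four_phase` and the `(req, ack)` sample encoding are assumptions.

```python
def is_four_phase(trace):
    """Return True if trace is a sequence of complete four-phase handshakes.

    trace is a list of (req, ack) wire levels, one sample per observation.
    """
    # The only legal successor of each (req, ack) pair.
    legal_next = {
        (0, 0): (1, 0),  # requester raises REQ
        (1, 0): (1, 1),  # responder raises ACK
        (1, 1): (0, 1),  # requester lowers REQ
        (0, 1): (0, 0),  # responder lowers ACK
    }
    state = (0, 0)  # both wires idle
    for sample in trace:
        if sample == state:
            continue  # nothing changed since the last sample
        if legal_next[state] != sample:
            return False
        state = sample
    return state == (0, 0)  # a complete handshake ends with both wires idle


good = [(1, 0), (1, 1), (0, 1), (0, 0)]
bad = [(1, 0), (0, 0)]  # REQ dropped before ACK arrived
print(is_four_phase(good), is_four_phase(bad))  # prints: True False
```

In the machine described above, holding MEM.REQ until MEM.ACK arrives and then waiting for "not MEM.ACK" corresponds to the second and fourth steps of this table.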

The memory module now has the complete address of the parameter block which needs to be
transmitted to INM_OUT. State RIP3 begins an interaction between the memory module and
Read_Init_Parameters that loads a set of registers appropriately. The handshake with the
memory module is begun by holding MEM.REQ. At the same time, the register counter (which was
initialized to 7) is incremented (and is now 0). When an acknowledgement is received from the
memory (MEM.ACK), and the register counter has finished counting up by one, a transition is made to
state RIP4, where the signal REG.DECODE.ENA signals the appropriate latch to gate in the value
from the memory bus. MEM.REQ is left on here so that the valid data on the memory bus does not
disappear before it can be latched. When the appropriate register signals that it has the data loaded
(REG.ACK), a transition is made to state RIP5. When the memory acknowledges the termination of
a transmission cycle (not MEM.ACK), a comparison with the register counter is made to see if all
required registers have been loaded (REG.CTR.EQ7). If not, the loop is repeated, incrementing the
register counter each time. If so, a transition is made to state RIP6 and the processing of the
Type-of-Service (TOS) table is performed.

The type-of-service table is to be a linear array of registers (or RAM cells), indexed by row and
column. Initially this indexing was done via a multiplication (in the Ada code). It was replaced with
a doubly nested loop to make the hardware implementation easier and more straightforward. In
state RIP6, the type-of-service column counter and type-of-service address counter are
incremented. They were initialized to their maximum value in state RIP0. At the same time, a handshake
with the memory module is begun (by raising MEM.REQ). When the memory has placed the data on
the line and replied by using MEM.ACK, and when the two counters, TOS.COL.CTR and
TOS.ADR.CTR, have been incremented, a transition is made to state RIP7. Here the TOS table is
signalled to load the value from the memory bus (TOS.REG.LOD). MEM.REQ is held high so that
the data on the memory bus remains valid. When the data is in the TOS table, TOS.REG.DON is
asserted and the next state becomes RIP8. This state terminates the handshake with the memory
module. When the acknowledgement from the memory arrives, if all columns in the current TOS
table entry have been processed, a transition is made to state RIP9 to proceed to the next TOS table
entry. If more columns in the entry need to be processed, the TOS.COL.CMP.EQ signal will be false
and the transition from state RIP8 to state RIP6 will be taken.
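
The replacement of the multiplication by a doubly nested loop can be illustrated with a short Python sketch. Both functions visit the same linear addresses; the second needs only counters that are incremented, which is what the column, row and address counters in the control unit provide. The function names are hypothetical and the code is a sketch of the idea, not the Ada source.

```python
def addresses_by_multiplication(rows, cols):
    """Index the linear table as row * cols + col (needs a multiplier)."""
    return [row * cols + col for row in range(rows) for col in range(cols)]


def addresses_by_counters(rows, cols):
    """Visit the same entries with a running address counter only."""
    addresses = []
    address = 0  # incremented once per table entry, never multiplied
    for _row in range(rows):
        for _col in range(cols):
            addresses.append(address)
            address += 1
    return addresses


print(addresses_by_counters(2, 3))  # prints: [0, 1, 2, 3, 4, 5]
```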

In state RIP9, the column counter (TOS.COL.CTR) is cleared and the row counter
(TOS.ROW.CTR) is incremented. When these two operations are complete, the next state becomes
RIPA, where a check is performed to see if the entire TOS table has been loaded. If it has not,
TOS.ROW.CMP.EQ will be false and a transition occurs from state RIPA to state RIP6. If
TOS.ROW.CMP.EQ is true, the output GO.ACK is asserted, terminating the "accept GO( ... )"
statement. When GO.REQ is lowered, the next state becomes RIP0 to begin over again when
necessary. Figure 5-6 contains the CUDL code for the Read_Init_Parameters state machine.

The CUDL code in Figure 5-6 was run through ASSASSIN. The code was simulated to verify that it
matched the flow-graph; the associated PPL program was then generated through compilation of the
CUDL code. Figure 5-6 contains a plot of the PPL program for the Read_Init_Parameters control.
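
The flow-graph walked through in this section can be written down as a transition table and stepped by hand, which is roughly what a simulation against the flow-graph checks. The Python sketch below is an illustration and not the ASSASSIN simulator. The dictionary encoding, the underscore spelling of the signal names, and the helper names `step` and `run` are assumptions; the state entered when SRV.REQ arrives is called RIP1A here.

```python
# Transition table: state -> list of (required input levels, next state).
TRANSITIONS = {
    "RIP0": [({"GO_REQ": 1, "INITNUM_REG_DON": 1, "INITNUM_CTR_DON": 1}, "RIP1")],
    "RIP1": [({"SRV_REQ": 1}, "RIP1A")],
    "RIP1A": [({"MEM_ACK": 1, "INITNUM_CTR_DON": 1}, "RIP2")],
    "RIP2": [
        ({"SRV_REQ": 0, "MEM_ACK": 0, "INITNUM_CMP_EQ": 1}, "RIP3"),
        ({"SRV_REQ": 0, "MEM_ACK": 0, "INITNUM_CMP_EQ": 0}, "RIP1"),
    ],
    "RIP3": [({"MEM_ACK": 1, "REG_CTR_DON": 1}, "RIP4")],
    "RIP4": [({"REG_ACK": 1}, "RIP5")],
    "RIP5": [
        ({"MEM_ACK": 0, "REG_CTR_EQ7": 1}, "RIP6"),
        ({"MEM_ACK": 0, "REG_CTR_EQ7": 0}, "RIP3"),
    ],
    "RIP6": [({"MEM_ACK": 1, "TOS_COL_CTR_DON": 1, "TOS_ADR_CTR_DON": 1}, "RIP7")],
    "RIP7": [({"TOS_REG_DON": 1}, "RIP8")],
    "RIP8": [
        ({"MEM_ACK": 0, "TOS_COL_CMP_EQ": 1}, "RIP9"),
        ({"MEM_ACK": 0, "TOS_COL_CMP_EQ": 0}, "RIP6"),
    ],
    "RIP9": [({"TOS_COL_CTR_DON": 1, "TOS_ROW_CTR_DON": 1}, "RIPA")],
    "RIPA": [
        ({"TOS_ROW_CMP_EQ": 0}, "RIP6"),
        ({"TOS_ROW_CMP_EQ": 1, "GO_REQ": 0}, "RIP0"),
    ],
}


def step(state, inputs):
    """Return the next state, or the same state if no transition is enabled.

    Signals missing from inputs are treated as low (0).
    """
    for required, target in TRANSITIONS[state]:
        if all(inputs.get(name, 0) == level for name, level in required.items()):
            return target
    return state


def run(inputs_per_step, state="RIP0"):
    """Apply one input sample per step and return the list of visited states."""
    visited = [state]
    for inputs in inputs_per_step:
        state = step(state, inputs)
        visited.append(state)
    return visited


# One address byte is sent, after which the comparator reports "equal".
one_address_byte = [
    {"GO_REQ": 1, "INITNUM_REG_DON": 1, "INITNUM_CTR_DON": 1},
    {"GO_REQ": 1, "SRV_REQ": 1},
    {"GO_REQ": 1, "SRV_REQ": 1, "MEM_ACK": 1, "INITNUM_CTR_DON": 1},
    {"GO_REQ": 1, "INITNUM_CMP_EQ": 1},
]
print(run(one_address_byte))
# prints: ['RIP0', 'RIP1', 'RIP1A', 'RIP2', 'RIP3']
```

A table of this kind keeps the comparison with the flow-graph mechanical: every arc in the figure should correspond to exactly one entry, and every entry to one arc.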




ControlUnit ReadInitParm:
StateMachine RIP:

    StartState RIP0:
        moveon GO_Req and InitNum_REG_DON and InitNum_CTR_DON to RIP1;
        hold InitNum_CTR_CLR, InitNum_REG_LOD, Reg_CTR_MAX;
        hold TOS_Col_CTR_MAX, TOS_Row_CTR_CLR, TOS_Adr_CTR_MAX;
    end;

    state RIP1:
        moveon SRV_Req to RIP1A;
    end;

    state RIP1A:
        moveon MEM_Ack and InitNum_CTR_DON to RIP2;
        hold MEM_Req, MEM_Send, InitNum_CTR_INC;
        set GO_Response;
    end;

    state RIP2:
        moveon not SRV_Req and (not MEM_Ack and InitNum_CMP_EQ) to RIP3;
        moveon not SRV_Req and (not MEM_Ack and not InitNum_CMP_EQ) to RIP1;
        hold SRV_Ack;
    end;

    state RIP3:
        moveon MEM_Ack and Reg_CTR_DON to RIP4;
        hold MEM_Req, Reg_CTR_INC;
    end;

    state RIP4:
        moveon Reg_Ack to RIP5;
        hold MEM_Req, Reg_Decode_ENA;
    end;

    state RIP5:
        moveon Reg_CTR_EQ7 and not MEM_Ack to RIP6;
        moveon not Reg_CTR_EQ7 and not MEM_Ack to RIP3;
    end;

    state RIP6:
        moveon MEM_Ack and (TOS_Col_CTR_DON and TOS_Adr_CTR_DON) to RIP7;
        hold MEM_Req, TOS_Col_CTR_INC, TOS_Adr_CTR_INC;
    end;

    state RIP7:
        moveon TOS_Reg_DON to RIP8;
        hold TOS_Reg_LOD, MEM_Req;
    end;

    state RIP8:
        moveon not MEM_Ack and TOS_Col_CMP_EQ to RIP9;
        moveon not MEM_Ack and not TOS_Col_CMP_EQ to RIP6;
    end;

    state RIP9:
        moveon TOS_Col_CTR_DON and TOS_Row_CTR_DON to RIPA;
        hold TOS_Col_CTR_MAX, TOS_Row_CTR_INC;
    end;

    state RIPA:
        moveon not TOS_Row_CMP_EQ to RIP6;
        moveon not GO_Req to RIP0;
        if TOS_Row_CMP_EQ then hold GO_Ack;
    end;
end;
end.


Figure 5-6: CUDL Code for Read_Init_Parameters Control

Figure 5-7 shows the composite layout.

The compilation of the control unit took approximately 2 minutes of DEC-System 20 CPU time.
The resulting circuit is 2028 microns by 1050 microns (39 PPL columns by 30 PPL rows using
6-micron geometry). The datapath related to the Read_Init_Parameters task cannot be laid out
until the relationship of some of the registers, which represent global variables (with respect to
Read_Init_Parameters), with other associated control and datapath elements has been established.

Figure 5-7: Composite Layout (NMOS) for Read_Init_Parameters Control
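
The reported dimensions imply a per-cell pitch and a total area that can be worked out directly. The short Python sketch below does the arithmetic; it assumes that all PPL cells share one uniform pitch, which the text does not state explicitly.

```python
width_um, height_um = 2028, 1050   # reported circuit size in microns
columns, rows = 39, 30             # reported PPL grid

cell_width_um = width_um / columns      # implied width of one PPL column
cell_height_um = height_um / rows       # implied height of one PPL row
area_mm2 = width_um * height_um / 1e6   # square microns to square millimetres

print(cell_width_um, cell_height_um)    # prints: 52.0 35.0
print(round(area_mm2, 4))               # prints: 2.1294
```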


6.  Conclusions 

ASSASSIN demonstrates several significant points.

1. Control can be specified at an abstract level and then automatically and easily
implemented as an integrated circuit module. It is possible to map control specified at even
higher levels of abstraction to something ASSASSIN understands, thereby enabling us to
make progress toward a true silicon compiler. Such work is reported in [11].

2. Self-timed (or asynchronous) control units with concurrency can be easily
implemented. ASSASSIN shows that the control for self-timed machines can be designed with
relative ease.

3. The successful use of Path-Programmable Logic in ASSASSIN shows that PPL has great
value as a circuit implementation technique, at least for this type of control unit. This
also shows that PPL is indeed amenable to the development of sophisticated CAD tools
that use it as the underlying circuit implementation technique.


4. The mapping of Ada's rich set of control constructs is very straightforward, as
illustrated by the generation of the control for the Read_Init_Parameters task.
ASSASSIN represents a step forward in the design of integrated circuits by allowing
lay descriptions of integrated circuit modules to be automatically compiled to a

I. The Syntax for ASSASSIN

The following is a BNF description of CUDL - the Control Unit Description Language. The
following are to be used in understanding this description:

    < >  - a non-terminal symbol
    { }  - 0 or more repetitions
    :=   - is defined as
    |    - OR

    language terminals are indicated by uppercase


<control-unit>           := CONTROLUNIT <identifier> :
                            {<input-descriptor>} <sm-list> END .

<identifier>             := <letter> <id-tail>

<id-tail>                := <letter> <id-tail> |
                            <digit> <id-tail> |
                            <letter> | <digit>

<input-descriptor>       := INPUTS: <input-reduction-list>

<input-reduction-list>   := <reduction-statement>
                            <input-reduction-list> |
                            <reduction-statement>

<reduction-statement>    := <identifier> := <condition> ;

<condition>              := <term> OR <condition> | <term>

<term>                   := <primary> | <primary> AND <term>

<primary>                := <identifier> | ( <condition> ) |
                            NOT <primary> | TRUE | FALSE

<sm-list>                := <sm-descriptor> |
                            <sm-descriptor> <sm-list>

<sm-descriptor>          := <sm-type> STATEMACHINE <identifier> :
                            <state-list> END ;

<sm-type>                := SELFTIMED | ASYNCHRONOUS | SYNCHRONOUS

<state-list>             := <state-descriptor> |
                            <state-descriptor> <state-list>

<state-descriptor>       := STARTSTATE <state-name> :
                            <statement-list> END ; |
                            STATE <state-name> :
                            <statement-list> END ;

<state-name-list>        := <state-name> , <state-name-list> |
                            <state-name>

<state-name>             := <identifier>

<statement-list>         := <statement> ; <statement-list> |
                            <statement>

<statement>              := <transition-statement> |
                            <action-statement>

<transition-statement>   := <transition-op> <transition>

<transition-op>          := MOVEON | FORKON |
                            JOINS <state-name-list> ON

<transition>             := <condition> TO <state-name-list> ; |
                            <condition> TO <state-name-list>
                            DOING <action-statement-list> ;

<action-statement-list>  := <action-statement> |
                            BEGIN {<action-statement> ;} END

<action-statement>       := <action-op> <output-list> |
                            <if-action-statement>

<action-op>              := HOLD | SET | RESET

<output-list>            := <output-name> , <output-list> |
                            <output-name>

<output-name>            := <identifier>

<if-action-statement>    := IF <condition> THEN
                            <action-statement-list> ;
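
The <condition>, <term> and <primary> productions above translate almost directly into a recursive-descent evaluator. The Python sketch below is an illustration under stated assumptions and not the ASSASSIN parser: keywords are matched case-insensitively (the CUDL listing uses lower case), identifiers may contain underscores (as in the listing, although the grammar names only letters and digits), and the names `tokenize` and `evaluate` are hypothetical.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z][A-Za-z0-9_]*|[()])")


def tokenize(text):
    """Split a condition into identifiers, keywords and parentheses."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text, signals):
    """Evaluate a CUDL-style condition against a dict of signal levels."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos].upper() if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def condition():  # <condition> := <term> OR <condition> | <term>
        value = term()
        if peek() == "OR":
            advance()
            rhs = condition()  # always parse the right side
            value = value or rhs
        return value

    def term():  # <term> := <primary> | <primary> AND <term>
        value = primary()
        if peek() == "AND":
            advance()
            rhs = term()
            value = value and rhs
        return value

    def primary():  # identifier | ( condition ) | NOT primary | TRUE | FALSE
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of condition")
        token = advance()
        upper = token.upper()
        if upper == "NOT":
            return not primary()
        if upper == "TRUE":
            return True
        if upper == "FALSE":
            return False
        if token == "(":
            value = condition()
            if peek() != ")":
                raise SyntaxError("missing ')'")
            advance()
            return value
        if token == ")" or upper in ("AND", "OR"):
            raise SyntaxError(f"unexpected token {token!r}")
        return bool(signals[token])

    result = condition()
    if pos != len(tokens):
        raise SyntaxError("trailing tokens after condition")
    return result


levels = {"SRV_Req": 0, "MEM_Ack": 0, "InitNum_CMP_EQ": 1}
print(evaluate("not SRV_Req and (not MEM_Ack and InitNum_CMP_EQ)", levels))
# prints: True
```

Because NOT applies to a <primary> and AND is parsed inside <term>, NOT binds tighter than AND, and AND binds tighter than OR, exactly as the productions dictate.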


II. Ada Code for the Read_Init_Parameters Task of the INM_OUT Submodule

(Inm_Out_Module)

task body Read_Init_Parameters is
   -- Accessed globals:
   --   number_of_local_net_types_of_service:      octet_type
   --   local_net_type_of_service_table_row_size:  octet_type
   --   tos_table:                                 octet_buffer_type

   -- Local variable declaration:

   -- The following variable is commented out. It appeared only in the
   -- "high-level" used to read in the TOS table. See below.
   -- number_of_tos_table_octets: integer range 2 .. max_tos_table_size - 1;

   octet_register: octet_type;

begin
   loop
      accept Go(
            init_num_formal: bits;
            response: out out_response)
      do
         response := sent_ok;  -- Also means init_ok.

         -- Get from the server all of the addr_chunks needed to form the
         -- base address in memory that holds the initialization parameters
         -- and sends these chunks to the Memory module.
         for index in 1 .. init_num_formal
         loop
            accept Srv_req(               -- Get next address
                                          -- chunk from the
                                          -- Server Module.
                  server_command_datum: srv_command;
                  response_to_server: out out_response)
            do
               memory_request(            -- Put chunk out to
                                          -- the memory module.
                  request_type_formal     => load_address,
                  chunk_of_address_formal => server_command_datum,
                  octet_formal            => dont_care_octet);
            end Srv_req;
         end loop;

         -- initialization parameters (contained in the

         loop

            memory_request(
               request_type_formal     => receive_datum_octet,
               chunk_of_address_formal => dont_care_X_datum,
               octet_formal            => octet_register);

eaaa  indax  la 

Hhan  1  «>  jnp_»ax_pacl(al.  lo  :b  octal  pagiatapi 

uhI:  I  ,  .=  ocl.tlpa^lil.P 

I  t  "■-•^raaa.langlh  ,b  octat.pa^atap 

«h!n  I  B>  hi  ..  octat_pa|latap 

when  6  B>  ack.iypa  oclr  :_paglalap; 


Hhan  7  B>  local_nal_typa_of_aapvlca..labla_poHlalia  ’ 

Hhan  8  B>  nu»bap_ot_loea  l_nal_lypaB_or_a7pl*  cT*® 

and  e»a,  »  «="  octa  l_pag  la  lap, 

         end loop;

         -- Read in type of service translation table.
         declare
            row_number: integer range
               0 .. number_of_local_net_types_of_service;
            col_number: integer range
               0 .. local_net_type_of_service_row_size;
            index: integer range
               0 .. number_of_local_net_types_of_service
                    * local_net_type_of_service_row_size
               := 0;
         begin
            row_number := 0;
            loop
               col_number := 0;
               loop
                  memory_request(
                     request_type_formal     => receive_datum_octet,
                     chunk_of_address_formal => dont_care_X_datum,
                     octet_formal            => tos_table(index));
                  col_number := col_number + 1;
                  exit when col_number = local_net_type_of_service_row_size;
                  index := index + 1;
                  if index > max_tos_table_size then
                     response := bad_srv_command;
                     return;  -- Exit the current accept statement.
                  end if;
               end loop;  -- End inner loop.
               row_number := row_number + 1;
               exit when row_number = number_of_types_of_service;
            end loop;  -- End outer loop.
         end;  -- End declare block.
      end Go;  -- End of init processing.
   end loop;  -- End of outer-most (infinite) loop.
end Read_Init_Parameters;

Abstract


We discuss the design of a program transformation system that is geared to aid in the
automated design of special purpose architectures (circuits), given a high level specification of a
problem. The synthesis of systolic implementations is outlined, and examples of syntactic forms
that aid in the description of such architectures (and algorithms tailored to them) are given.
Finally, we summarize the results of applying the methodology in synthesizing several classes
of systolic designs (proceeding from abstract, axiomatic specifications), and in the VLSI
implementation of an Ada program fragment describing a part of the DoD Internet Protocol.

1. Introduction

The need for design methodologies for special purpose VLSI circuits that help combat the
spiralling complexity and cost of current day integrated circuit designs is by now well
established [15]. We believe that it is also important that such methodologies enable a smooth
embedding of the resulting circuits into larger systems that consist of both software and
specialized hardware components, e.g., on-board control systems. In this context, we have been
exploring the use of high level languages as a medium for specifying the desired behavior of
special purpose systems, as well as paradigms for mapping such specifications into VLSI
architectures [26, 25, 11]. We are currently developing a set of automated tools for
transforming axiomatic and/or high level language program specifications in Ada into
integrated software-hardware systems [12, 13]. In this paper we describe some of the details of
the design of our transformation system, and in particular the manner in which the language
constructs influence the architecture of the final machine. We then indicate some ways in
which parallelism may be exploited, and how systolic designs may be synthesized. Syntactic
constructs suitable for describing the behavior of special purpose architectures are also
discussed. Finally, some preliminary results in applying the methodology to non-toy examples
are outlined; these include various classes of systolic designs and a hardware implementation of
an Ada program fragment that describes a part of the Department of Defense Internet Protocol.


1.1.  Overall  Approach 

We first summarize briefly our overall approach to the design of integrated
software-hardware systems.

The initial specifications are annotated Ada programs. The "annotations" [9, 8, 22] allow for
a statement of

1. Abstract axiomatic specifications of the behavior of a system, including statements
of temporal characteristics.

2. Performance requirements to be met by an acceptable implementation along
various dimensions of interest, e.g., area, time, response time, throughput,
reliability, etc.

3. Relevant characteristics of the external environment a system is designed to
operate in, e.g., external timing constraints, relative function application
frequencies, etc.

Given either abstract specifications, or an Ada program, or a combination, the following
transformations may now be attempted:

- If the initial specifications are axiomatic, then these may be directly translated into
an implementation suitable for being cast into silicon [25].

- Alternatively, the abstract specifications may be transformed into an
implementation using primitives available in typical high level languages, e.g.,
Ada [23].

- The high level language programs may be transformed into hardware
implementations [12].

In essence, the annotated Ada specifications may be transformed into any desired mixture of
software programs and special purpose hardware. The transformation into hardware is
attempted in two phases: the output of the first phase is a symbolic description of the hardware
implementation, which is then transformed into a set of masks suitable for actually fabricating
the circuit. The latter translation uses a program that automatically generates layouts for
asynchronous control units, given their symbolic description [3]; the layout of the data paths is
currently done interactively using existing relatively low level design aids (e.g., a
ComputerVision system).

The symbolic description of the hardware implementation is couched in an extended Ada
syntax, by using "macros" for describing specialized hardware structures and algorithms
tailored to them. Two major reasons for the use of such syntactic extensions are that (1) we
have found it clumsy to describe certain kinds of concurrency (both at a high and low level) if
we are constrained to use existing Ada program structures; (2) specialized primitives are very
often more appropriate for succinctly describing algorithms that are tailored to special
architectures.

We have found that the problem decomposition strategy and the configuration of target
structures chosen are very often critically influenced by the desired performance requirements
and the complexity measures associated with the target primitives available. This "strategy
guidance" may be done either by using automated complexity computation aids [17, 24] or
interactively.

A typical transformation scenario can roughly be divided into two "phases": (1) an analysis
phase, wherein some global information relating to the program/specification is gathered; and
(2) a synthesis phase, wherein the implementation is built up. The analysis phase typically
requires an examination of the entire program; this is usually done by traversing the parse tree.
The synthesis phase is typically incremental in nature, and involves the use of the information
gathered in the analysis phase and (optionally) further information of a more specific nature
(i.e., not computed in the analysis phase) which may involve non-local analysis.

In essence, therefore, there is a common set of global properties needed for guiding the
transformations which is profitably gathered in what we henceforth refer to as the (global)
"analysis" phase, and a set of more specific properties that are better computed if and when
needed. This separation into two phases, albeit somewhat nebulous, allows for

- Conceptual clarity

- Improved efficiency (because global traversals tend to be comparatively expensive)

- Added flexibility in "global" decision making, since one is not forced to make an
implementation decision prematurely.
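
The split between a global analysis pass and an incremental synthesis pass can be sketched in a few lines of Python. This is an illustration only: the tuple-based parse tree, the function names, and the rule "assigned at least once means register, otherwise input wire" are all assumptions made for the example, not the policy of the transformation system.

```python
from collections import Counter


def analyze(tree, writes=None):
    """Analysis phase: one traversal of the whole tree, counting assignments."""
    if writes is None:
        writes = Counter()
    if tree[0] == "assign":          # ("assign", target_name, expression)
        writes[tree[1]] += 1
    for child in tree[1:]:
        if isinstance(child, tuple):  # descend into nested nodes only
            analyze(child, writes)
    return writes


def synthesize(declared, writes):
    """Synthesis phase: decide per declaration, using the gathered facts."""
    plan = {}
    for name in declared:
        plan[name] = "register" if writes[name] > 0 else "input wire"
    return plan


program = (
    "seq",
    ("assign", "x", ("lit", 1)),
    ("if", ("var", "flag"), ("assign", "x", ("var", "y"))),
)
facts = analyze(program)
print(synthesize(["x", "y", "flag"], facts))
# prints: {'x': 'register', 'y': 'input wire', 'flag': 'input wire'}
```

The expensive traversal happens once in `analyze`; `synthesize` can then be called declaration by declaration without walking the tree again.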

The remainder of this paper is organized as follows. In the next section we discuss the
transformation of specific classes of syntactic constructs in Ada into hardware structures. In
section 3, we focus on a few of the strategies that enable us to exploit parallelism, and
then delineate the development of systolic designs (proceeding from either abstract
specifications or from Ada programs). We describe some examples of syntactic constructs that
aid in the succinct symbolic description of systolic designs, and in the transformation process. In
appendix 1, we summarize the results of applying the methodology in the transformation of a
fragment of an Ada program specifying the Department of Defense Internet Protocol [16] into a
hardware implementation.


2.  Transformation  Strategies 

We outline here a set of transformation strategies that we have developed for some of the
commonly used syntactic constructs in Ada [1]. These can be broadly classified into either a
"direct" (in situ) transformation of the language construct, or an "indirect" one, involving some
optimization and flow analysis. The latter can be thought of as a set of source-to-source, i.e.,
Ada-to-Ada, transformations that account for the desired optimizations, followed by a "direct"
transformation. For the examples discussed in this paper, the target hardware model assumed
is an asynchronous one [2] wherein state transitions are controlled by request-acknowledge
protocols that are implicitly embedded in the underlying model.

To facilitate exposition, we consider the Ada constructs in order of increasing complexity so
that we can use the examples for, say, an assignment statement, in an if statement. We split
the basic constructs into two classes. The declarative constructs serve to determine the
collection of registers, the storage elements and the data paths between them. We refer to this
as the "environment" part of the chip. The statements in the body of the program determine the
(ensemble of) state machine(s) that constitute the "control" part of the chip. It is to be noted
that this distinction is not very rigid, since, in general, the environment part of the circuit is
affected by the statements and other constructs present in the procedural part of the program,
and vice-versa.

The statement part of a program may in turn be viewed as contributing to either intertask
communication or intratask computation. We envision an Ada task as a "standalone" circuit
which is capable of communicating with other (co-)tasks. Since the Ada language specification
does not detail the manner of this intertask communication, except for asserting that the
underlying machinery ensures the existence of an asynchronous protocol (where the selection of
ready tasks may at times be non-deterministic), we in fact implement an explicit interfacing
machine which handles communication with other tasks. Its purpose includes maintaining
information about the availability of the task machine for calls from the "outside", its allocation
to different callers depending on any priority mechanism that might be desired, and
maintaining queues to allow for conflicts. A detailed discussion of various intertask
communication strategies and the trade-offs involved is contained in a companion report (see
also [12]).
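
The three duties of the interfacing machine (tracking availability, allocating the task by priority, and queueing conflicting callers) can be modelled behaviourally in Python. The sketch below is hypothetical: the class name `TaskInterface`, its methods, and the convention that a lower number means higher priority are assumptions, and nothing here reflects the actual circuit.

```python
import heapq
from itertools import count


class TaskInterface:
    """Queues entry calls and grants the task to one caller at a time."""

    def __init__(self):
        self._queue = []          # heap of (priority, arrival order, caller)
        self._arrival = count()   # tie-breaker: first come, first served
        self.busy = False         # availability of the task machine

    def request(self, caller, priority=0):
        """Record a call; lower priority numbers are served first."""
        heapq.heappush(self._queue, (priority, next(self._arrival), caller))

    def grant(self):
        """Allocate the idle task to the best waiting caller, if any."""
        if self.busy or not self._queue:
            return None
        self.busy = True
        return heapq.heappop(self._queue)[2]

    def release(self):
        """Mark the task machine as available again."""
        self.busy = False


iface = TaskInterface()
iface.request("A", priority=1)
iface.request("B", priority=0)
iface.request("C", priority=1)
print(iface.grant())   # prints: B
print(iface.grant())   # prints: None  (task is busy)
iface.release()
print(iface.grant())   # prints: A
```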


2.1.  Declarative  Constructs 


2.1.1. Object Declarations

There are two kinds of object declarations in the Ada language: those which declare
identifiers to be of a predeclared subtype, and those that declare them to be arrays. Of the
predeclared subtypes, the most basic are the language-defined primitive subtypes, integer, real
and boolean. A declaration of an identifier (or an identifier list) to be of any of these types
results in its implementation being selected from a library of available primitives. For integers
this presently consists of registers and RAM's. The registers used for integer implementation
are in turn made up of flip-flops varying in complexity from simple flip-flops to two-phase,
read-write flip-flops. The choice among these alternatives depends on the results of global
data-flow analysis. Reals are implemented as special floating point registers, along with an
encoding scheme and special arithmetic functions. Some booleans, depending on the results of
global analysis, may be found to be redundant in the circuit. These may be
implemented as combinational circuitry that computes their value at all instants. The booleans
that cannot be eliminated in this manner are implemented as single flip-flops.

If the object declaration is an array declaration, this is usually implemented as RAM's of the
appropriate primitive type. The range of values that the variable can assume is used to compute
a default maximum size for the RAM, which is further narrowed down, if possible, by using
global analysis.

For object declarations that declare identifiers to be of some non-primitive type, the
transformation system implements them as specified in the implementation of the type
declaration for the particular type.


2.1.2. Type Declarations

An Ada type declaration defines a new class of objects. This can either be a simple range
restriction on the predefined Ada types, viz. integers and reals, an enumeration type, an array
type definition, an access type definition, a derived type definition or a private type definition.
For every type definition the transformation system maintains information about a default
implementation in a predetermined template. When transforming object declarations of this
type, this information is used to guide the particular implementation strategy adopted. The
stored information is incrementally refined when global analysis is performed on identifiers
declared to be of the particular type. Currently this is specified interactively by the user.

If the type declaration is a restricted range on a predefined Ada type, the limits of the range
are either constant or variable identifiers. The first case implies a direct upper limit on the size
of all identifiers that are declared to be of the type, and this information is added to the
template implementation. If the limits of the range are identifiers, the results of global
data-flow analysis for the identifier are used to establish an upper bound on the range, and this
information is stored in the template.

Alternatively, an Ada type declaration may define arrays, enumerations, records and access
types. Currently the default array implementation consists of either RAM's or ROM's. The
ranges of the indexing variables determine the size of the RAM, and the range of the type of
individual objects in the declared array governs the word-size of the RAM. Since determination
of minimum storage at compile time is, in general, a computationally impossible task, we have
a default maximum on the size. The transformation system finalizes this decision after
interacting with the user. Sometimes the user is able to specify the sizes more restrictively than
the system ever could because of a more thorough understanding of the program. This aspect of
the transformation system is more critical than in conventional compilers, because of our
special target medium.
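
How index and element ranges translate into a default RAM shape can be shown with a small calculation. The Python sketch below assumes plain binary addressing and non-negative element values stored in plain binary; the function name `ram_shape` and these encoding choices are assumptions for illustration, not the system's documented rules.

```python
def ram_shape(index_low, index_high, value_low, value_high):
    """Return (words, address_bits, word_bits) for an array declaration."""
    if index_high < index_low:
        raise ValueError("empty index range")
    if value_low < 0:
        raise ValueError("this sketch only handles non-negative elements")
    words = index_high - index_low + 1
    # Bits needed to address words distinct locations (at least one wire).
    address_bits = max(1, (words - 1).bit_length())
    # Bits needed to hold the largest element value in plain binary.
    word_bits = max(1, value_high.bit_length())
    return words, address_bits, word_bits


# An array indexed 0 .. 63 whose elements range over 0 .. 255.
print(ram_shape(0, 63, 0, 255))  # prints: (64, 6, 8)
```

A tighter range supplied by the user, or discovered by global analysis, simply changes the arguments and shrinks the result.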

For  enumeration  types,  the  transformation  system  determines  a  minimal  binary  encoding  for 
the  set  of  objects  declared  in  a  reasonably  straightforward  manner.  All  future  referenora  to 
these  enumerated  constants  are  translated  to  a  reference  to  one  or  more  of  these  encodings.  The 
other  type  definitions  are  somewhat  more  complicated,  but  the  underlying  theme  of 
determining  a  default  implementation  for  them  is  carried  over,  and  a  template  is  maintained  to 
hold  this  information. 
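A minimal sketch of one such encoding, assuming the simplest scheme in which the literals are numbered in declaration order and given fixed-width binary codes (the actual system may choose codes differently):

```python
def encode_enumeration(literals):
    """Assign each enumeration literal a fixed-width binary code.

    The width is the smallest number of bits that can distinguish all
    literals, with a minimum of one bit.
    """
    width = max(1, (len(literals) - 1).bit_length())
    return {name: format(i, f"0{width}b") for i, name in enumerate(literals)}


codes = encode_enumeration(["RED", "GREEN", "BLUE"])
print(codes)
```

Three literals need two bits, so one code word is left unused; five literals would need three bits.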


2.1.3.  Renaming,  Use  and  With  Declarations

Renaming (and equivalent use/with) declarations are used in Ada to provide new names for
identifiers, particularly if the identifier is declared in a different program unit. They do not
imply that a separate copy is maintained, but are simply a notational convenience. As far as the
chip architecture is concerned, they indicate the necessity of running a bus between two
modules to make the variable available to both. (It is also possible to have duplicate copies, and
ensure that consistency is maintained, but this approach is not currently used by the
transformation system.) In cases where the whole circuit occupies more than one chip, or when
the modules are physically placed far apart, renaming declarations enable some flexibility in
exactly which module contains the actual instance of the object declared. (We currently prefer
to rely more on use/with declarations, since too heavy a use of renaming declarations leads to
more human errors that are difficult or impossible for a compiler to detect.)


2.1.4.  Subprogram,  Package  and  Task  Declarations

These  kinds  of  declarations  have  been  grouped  together  because,  in  general,  they  are  all 
program  units.  Thus  they  indicate  the  presence  of  different  computational  modules.  The  scoping 
rules  of  Ada  determine  how  these  modules  access  variables  present  in  other  modules,  and 
govern  the  generation  of  additional  communication  circuitry  if  necessary.  If  a  subprogram 
module  has  more  than  one  potential  calling  module  it  becomes  necessary  to  provide  some 
arbitration  between  possible  conflicts.  This  is  currently  done  interactively,  where  the  user 
either specifies the arbitration circuitry or (usually) tells the system to assume that no conflict
will  occur. 


2.2.  Imperative  Constructs 

2.2.1.  Assignment  Statements  (involving  simple  variables) 

The  general  form  of  an  assignment  statement  is 

<Identifier>  :=  <Expression>


The "code" for the target machine is generated by a top-down traversal of the parse tree. The
transitions in the asynchronous target machine coincide with the order of node-visits in the
top-down traversal of the parse tree. We illustrate the method with the familiar example
shown below. Consider the simple assignment statement

a := b*2-4*a*c;
with  the  abstract  parse  tree  as  shown  below  (Figure  2-1). 
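The traversal can be sketched in Python as follows. This is only an illustration under our own assumptions: the tuple representation of the parse tree, the state names `S1`, `S2`, ... and the helper functions are hypothetical, and only the two operators that occur in the example are handled. Each operator node becomes a state that issues requests to its subordinates, and states are created in the order in which the top-down traversal visits the nodes.

```python
import itertools

# a := b*2 - 4*a*c as a nested (operator, left, right) tuple.
RHS = ("-", ("*", "b", 2), ("*", ("*", 4, "a"), "c"))


def build_states(node, states=None, counter=None):
    """Top-down (pre-order) walk: one state per operator node."""
    if states is None:
        states, counter = [], itertools.count(1)
    if not isinstance(node, tuple):
        return node, states          # variable or constant: no state needed
    op, left, right = node
    entry = {"state": f"S{next(counter)}", "op": op, "requests": []}
    states.append(entry)             # parent is visited before its children
    for child in (left, right):
        ref, _ = build_states(child, states, counter)
        entry["requests"].append(ref)
    return entry["state"], states


def run(states, root, env):
    """Resolve requests bottom-up, as the acknowledges would arrive."""
    table = {s["state"]: s for s in states}

    def value(ref):
        if isinstance(ref, str) and ref in table:
            left, right = (value(r) for r in table[ref]["requests"])
            return left * right if table[ref]["op"] == "*" else left - right
        return env[ref] if isinstance(ref, str) else ref

    return value(root)


root, states = build_states(RHS)
result = run(states, root, {"a": 1, "b": 10, "c": 2})
print([s["state"] for s in states], result)
```

The sketch assumes that no program variable is itself named like a state (`S1`, ...); a real implementation would keep the two name spaces separate.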

The root of this tree is mapped into a state, DoAssignment, which sends requests to
subordinate states which perform the computations required. When it receives acknowledge
signals from all such secondary states it causes the result to be "load"ed into the LHS of the
statement. Here the code for "computing" the LHS is trivial since the LHS of the statement is a
Figure  2-1:  Abstract  parse  tree
..n.b,a.  The  only  other 
these  two  states,  followed  by  code  to  from  "DoAssignment"  issue  calls  to

“’'feSr‘‘”w  tt°e‘eubmrtion“dr"ury  h.e  been  Se n  4wr4  « 

task  body  StateDoAssignment  is 

fork  ton 

end  ForkToComponents; 
end  StateDoAssignment;

The code for joining the results of the two, and actually loading the result, is as
follows.

task  body  StateFinishAssignment  is

DoJoins:  join  (acknowledge)  (LastOfLeftArg,

LastOfRightArg);

end  DoJoins;

hold(RegA.load);

end  StateFinishAssignment;


Continuing  with  the  above  exomPje  system  is  ^ ™ 

implementation  shown  in  Figure  2-2-  Hw  both  operands  of  the 

S,^rn»us“”T»o  oTSS^  lnX° two  sepnnete  Ws  cen  be  eomblned  Into  one  e  . 
can  the  ‘shift-a'  and  ‘shift-b’  P^^k^toTaSn

state-machine  is  o^iiTthe  computaUon.  However,  b  cannot,  and  it  mujit 

^-e  in  the  last  stage. 
Figure  2-2:  Two  implementations  of  a := b*2-4*a*c

■  s-=23r;.-^:s?:«'?  a»Hi:S*s: 

hardware.  M  ultlpHcation /division  by  pwem  directly  as 

implemented  as  shifts.  because  it 

combinational  circuits.  Ex^nenti  ,  .  ,  ’  alternative  strategy  for 

is.lo  'iS'olv'e 

regularity  of  usage  is  required  in  this  context.  ,  ,  , 

2.  Common-Subexpression  Identification:  ...  program  involves  either
...  or  some  kind  of  "indirect"  transformation  of  the
source  to  incorporate  the  results  of  the  data  flow  analysis.

3.  Temporary  Storage  Determination:  ...
for  storing  intermediate  ...  Data  flow  analysis  is  useful  in
optimization  of  temporary  storage,  indicating  if  registers  (or  ...)
used  in  earlier  parts  of  the  machine  can  be  reused.  Here  we  have  an
added  advantage  in  that  there  is  no  a  priori  upper  bound
on  the  number  of  such  storage  units,  and  they  are  not  restricted  to
certain  specific  data  types.
2.2.2.  Conditional  Statements

The  general  form  of  conditional  statements  is  as  follows. 


<ConditionalStatement>  :=  ’if’  <Condition>  ’then’  <Statement>

[’else’  <Statement>] 


The machine that performs an if statement's function consists of a state (or a set of states) to
evaluate the <Condition> part. This state returns a boolean value, depending on which the
machine makes a transition to one of two states that are the start states for the two
<Statement>’s. If there is no else clause, one of the branches makes a transition directly to the
statement following the if statement. This is shown in the figure below (Figure 2-3).


Figure  2-3:  Skeletal  state  machine  for  an  if  statement  (states:  Eval  Cond,  Do  Stmt  1,  Do  Stmt  2,  Exit  Cond)
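The skeletal machine can be mimicked by a small transition table. The Python sketch below is illustrative only; the state names follow the figure caption loosely and the table representation is our own assumption.

```python
def if_machine(has_else=True):
    """Transition table for 'if <Condition> then <Stmt1> [else <Stmt2>]'."""
    table = {
        # The condition state branches on the boolean it computes.
        "EvalCond": {True: "DoStmt1", False: "DoStmt2" if has_else else "Exit"},
        "DoStmt1": {None: "Exit"},
    }
    if has_else:
        table["DoStmt2"] = {None: "Exit"}
    return table


def run_if(table, cond):
    """Return the sequence of states visited for a given condition value."""
    state, trace = "EvalCond", []
    while state != "Exit":
        trace.append(state)
        edges = table[state]
        state = edges[cond] if state == "EvalCond" else edges[None]
    trace.append("Exit")
    return trace


print(run_if(if_machine(), True))
print(run_if(if_machine(has_else=False), False))
```

Without an else clause the false branch goes straight to `Exit`, which stands for the first state of the statement following the if statement.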

In addition to guiding the transformation of assignment statements, inferences from global
analysis are used to determine the presence of redundant boolean variables in the source
program. Such variables are then replaced by just the output line from some combinational
circuitry.


2.2.3.  Loop  Statements

Ada  provides  for  both  simple,  unconditional  loops  as  well  as  while,  for,  and  until  loops.  A
construct  of  the  form 


’loop’  <SequenceOfStatements>  ;

is implemented as the set of states that execute the <SequenceOfStatements>, followed by a
direct transition to the first state in the <SequenceOfStatements>. Any "exit" statements
inside the <SequenceOfStatements> translate to transitions to the state immediately following
the loop.

We indicate in the next section how such constructs may be used for the synthesis of systolic
chips.

For  while-loops  of  the  form 

’while’  <Condition>  ’loop’  <SequenceOfStatements>  ;

the transformation is similar, with the exception that the states for
<SequenceOfStatements> are preceded by states similar to those for a conditional statement
(without an else clause), and the last state in <SequenceOfStatements> is followed by an
unconditional transition back to the states for evaluating the condition.
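A minimal Python sketch of the while-loop machine follows. The three state names and the `max_steps` safety limit are our own assumptions for illustration; the loop body is collapsed into a single `Body` state.

```python
def run_while(cond, body, state, max_steps=1000):
    """Simulate EvalCond -> Body -> EvalCond ... -> Exit.

    cond  -- function from the data state to a boolean
    body  -- function from the data state to the next data state
    Returns the final data state and the list of control states visited.
    """
    visited, current = [], "EvalCond"
    for _ in range(max_steps):
        visited.append(current)
        if current == "EvalCond":
            current = "Body" if cond(state) else "Exit"
        elif current == "Body":
            state = body(state)
            current = "EvalCond"     # unconditional transition back
        else:                        # "Exit": state following the loop
            return state, visited
    raise RuntimeError("step limit reached")


final, visited = run_while(lambda n: n < 3, lambda n: n + 1, 0)
print(final, visited)
```

An "exit" statement inside the body would simply add another edge from `Body` to `Exit`.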

For constructs wherein the loop consists only of a select statement (many task bodies fall
into this category), the loop can be replaced by a single state where the machine waits until it
receives a signal from any of the modules that call the corresponding accept statements. It then
makes a transition to the appropriate set of states, performs the required computation, and
returns to the "wait" state.


2.2.4.  Procedure  Calls

Procedure calls are directly implemented using Request/Acknowledge communication
between the caller and the state machine that implements the procedure. The calling state
first loads the parameters of the procedure on the bus/lines to the called machine, and then
issues a request to it. Alternatively, a "lazy evaluation" kind of scheme may be used, where the
parameters are evaluated by the caller only when needed by the called module (in response to a
demand from it). After the caller receives the acknowledge from the function module (which
implies that the output data line(s) from it are valid) it makes a transition to the next state.
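The eager form of this protocol can be sketched in Python as below. The class and function names are hypothetical, and the handshake is reduced to three observable events (load parameters, request, acknowledge); timing and the lazy-evaluation variant are not modelled.

```python
class CalledMachine:
    """State machine implementing a procedure, seen from its interface."""

    def __init__(self, func):
        self.func = func
        self.params = None     # bus/lines carrying the parameters
        self.result = None     # output data lines
        self.ack = False       # acknowledge line

    def request(self):
        self.result = self.func(*self.params)
        self.ack = True        # acknowledge implies the outputs are valid


def call(machine, *args):
    """Caller side: load parameters, issue the request, wait for acknowledge."""
    trace = []
    machine.ack = False
    machine.params = args
    trace.append("load-params")
    machine.request()
    trace.append("request")
    if not machine.ack:
        raise RuntimeError("no acknowledge received")
    trace.append("acknowledge")
    return machine.result, trace


adder = CalledMachine(lambda x, y: x + y)
print(call(adder, 2, 3))
```

Only after the acknowledge does the caller read `result` and move to its next state.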

Global analysis is used to obtain information such as the following:

1.  Whether it is useful to implement the function "in line". This saves some
communication overhead at the expense of increased silicon area. In effect such an
arrangement provides a private copy of the procedure to every caller. In VLSI we
have the added advantage of not being restricted to a universal scheme. Some
procedures can be implemented in-line while others may be centrally shared
modules. An even more general solution provides some callers (depending on
estimated/measured frequency of use) with private copies of the function, while
others share a common unit.

2.  Identification of globals accessed in the procedure body. This involves deciding on
appropriate communication protocols and routing considerations.


2.3.  Optimization

Optimizations of a design are possible at all of the levels in the design hierarchy:

- At the very lowest level, it is possible to increase system performance by
redesigning individual transistor layouts (e.g. changing Width/Length ratios) to
increase speed etc.

- At a somewhat higher level, performance improvements can be obtained by using
specialized circuits to achieve certain functions instead of using a standard cell set.

- At the next level, symbolic versions of layouts can be locally "manipulated" in order
to improve efficiency, e.g., this may involve swapping adjacent columns (or rows) of
PPLs etc., while ensuring that logical function is not impaired.

- At the state machine level, performance improvement can be effected by state
minimization, improved parallelism, etc.

- Finally, the high level architecture of the implementation can be juggled in order to
improve performance, while maintaining consistency with the abstract,
representation independent, specifications of the problem.


It is important to note that these levels have rough analogs in the realm of standard
language translation/machine architecture: faster/more powerful instruction sets, peephole
optimization, flow analysis on intermediate compiler code, and algorithm improvement.
Further, the overall improvement is typically greater the closer the optimizations are to the
initial stages of development of an implementation; it is therefore more advantageous to
attempt to design an appropriate architecture (/algorithm), rather than spend time optimizing
channel layouts.


3.  Systolic  Architectures

In this section we delineate a few transformations that enable the synthesis of some classes of
systolic designs. For the sake of brevity, we deal here only with a few classes of looping and
recursion constructs. The methods are applicable to a wider class of starting points, and the
theoretical basis for the mechanical synthesis of such designs (among others) is elaborated upon
in [26]. As a consequence, we have here chosen to emphasize examples of syntactic constructs
that are suitable for describing such algorithms and architectures, rather than the details of the
synthesis strategy itself.

The primary decompositions possible are one or more of sequential composition, parallel
decomposition, and pipelining. Which decomposition scheme is adopted typically depends upon
the performance criteria desired, a detailed analysis of which we omit here. For example,
pipelining improves throughput, while parallel processing improves both throughput and
response time over sequential solutions. Of course, the response time is very much dependent
upon the algorithm used (i.e., upon what the specific decomposition is, what the
subcomputations involved are, and how the partial results are combined), and to a lesser extent
upon the lower level circuit implementation strategies. In particular, we recall that as a
consequence of wire delays being the dominant factor in single chip implementations,
asynchronous implementation strategies are preferable in order not to slow down the whole
system and to minimize skewing effects.

We  now  discuss  examples  of  syntactic  macros  that  aid  the  representation  of  such
decompositions. 


3.1.  Iteration 

Consider  the  loop  structure 

for  i  in  1  ..  N  loop
x(i)  :=  F(x(i));
end  loop;

A possible sequential implementation of this loop structure is shown in Figure 3-1. This
implementation consists of a processing element (or cell) that computes the function F. When
the stream of values x1, ..., xN is input to the F-cell, the output is the stream F(x1), ..., F(xN).

A parallel implementation is possible if the computation of F does not have any side effects
on the subsequent computations in the loop. Such an implementation can use N instances of the
same F-cell, input the vector of values <x1, ..., xN> in parallel, and output the vector of results
<F(x1), ..., F(xN)> in parallel. The i-th instance of the F-cell thus inputs xi and outputs F(xi).

This is illustrated in Figure 3-2.

When  each  computation  through  the  loop  results  in  the  computation  of  a  partial  result  that  is
"assembled  together"  in  the  subsequent  iterations,  a  pipelined  implementation  can  be 
generated. 

Thus,  if  we  consider 

for  i  in  1  ..  N  loop
x  :=  F(x);
end  loop;

then  a  pipelined  implementation  using  N  instances  of  F-cells  is  shown  in  Figure  3-3.

A  combination  of  one  or  more  of  these  techniques  can  obviously  be  employed  whenever 
needed. 
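The three alternatives can be compared with a small Python simulation. This is a sketch under simplifying assumptions of our own: every F-cell takes one clock tick, `None` marks an empty pipeline register (so `None` cannot be used as a data value), and the function names are hypothetical.

```python
def sequential(F, xs):
    """One F-cell; the N values stream through it one per tick."""
    return [F(x) for x in xs], len(xs)


def parallel(F, xs):
    """N F-cells side by side; all results appear after one tick."""
    return [F(x) for x in xs], 1


def pipeline_stream(F, inputs, n_stages):
    """Chain of n_stages F-cells; each input has F applied n_stages times."""
    regs = [None] * n_stages            # output register of each F-cell
    feed, outputs, ticks = list(inputs), [], 0
    while feed or any(r is not None for r in regs):
        if regs[-1] is not None:
            outputs.append(regs[-1])    # value leaving the last cell
        # Update from the last stage backwards so each value moves one cell.
        for i in range(n_stages - 1, 0, -1):
            regs[i] = F(regs[i - 1]) if regs[i - 1] is not None else None
        regs[0] = F(feed.pop(0)) if feed else None
        ticks += 1
    return outputs, ticks


inc = lambda v: v + 1
print(sequential(inc, [1, 2, 3]))
print(parallel(inc, [1, 2, 3]))
print(pipeline_stream(inc, [0, 10], 3))
```

In the pipelined case a new input can enter on every tick, which is why pipelining improves throughput even though each individual result still passes through all N cells.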


3.2.  Recursion 

Some classes of recursive functions (procedures) can also be mapped into systolic
implementations. It is of course possible to first apply standard recursion to iteration
transformations and then apply the techniques discussed here. It is however also possible to
avoid this intermediate step in several cases. As an example, the form shown below can be
directly transformed into either of the implementations shown in Figure 3-4.

function  match  (s,  p:  string)  return  boolean  is
begin
if  s  =  null
then  if  p  =  null
then  return(true)
else  return(...)
else  if  ...
then  match(AllButLast(s),  ...
end  match;


Figure  3-1:  Sequential  Implementation

Figure  3-2:  Parallel  Implementation

Figure  3-3:  Pipelined  Implementation


luit  specific  interconnection  topologies  at  hand,  m  p 
Figure  3-4:  Implementations  generated  for  Match


advantages over the expanded/graphical forms in pattern matching and automated
transformation in that (1) there is a significant decrease in complexity in doing textual pattern
matching over doing graphical pattern matching (sublinear vs. quadratic or more); (2) the
absence of global interdependency of subcomputations in the iteration body is explicit and does
not have to be inferred by global data flow analysis; (3) performance metrics can be easily
defined over such succinct representations: this facilitates automated complexity computation,
although a graphical representation (which is isomorphic) typically facilitates human
computation/comprehension.


3.3.1.  Broadcasting 

Broadcasting  a  signal  to  a  set  of  ports  associated  with  some  collection  of  processing  elements 
is  stated  as 

Broadcast(signal,  Set_of_Ports)

For example, the Set_of_Ports may be a collection of named ports of an array of similar
processing elements.

Roughly speaking, port names of cells may be viewed as entries of tasks associated with
them. Thus, consider a MULTIPLY_ADD_CELL that has 3 inputs (ports) a, b, and c,
and outputs a single value a*b+c. We can describe a linear array of
MULTIPLY_ADD_CELLs, which is useful in several systolic algorithms for matrix
computation, as

MULTIPLY_ADD_CELLS:  array(1..N)  of  MULTIPLY_ADD_CELL;

If we then want to state that x is broadcast to the N input ports named "a" of the array of
processing elements MULTIPLY_ADD_CELLS, we can express this as

Broadcast(x,  MULTIPLY_ADD_CELLS.a)

Note that this is identical to saying

for  i  in  1..N  loop  CONNECT(x,  MULTIPLY_ADD_CELLS(i).a)  end  loop;

This  can  be  generalized  in  an  obvious  manner  to  more  complicated  cases,  including  one 
wherein  the  set  of  ports  is  computed  dynamically. 
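The expansion of a Broadcast into individual connections can be sketched in Python as follows. The netlist representation (a list of source/sink pairs) and the function signatures are our own assumptions for illustration.

```python
def connect(netlist, source, sink):
    """Record one point-to-point connection."""
    netlist.append((source, sink))


def broadcast(netlist, signal, array_name, n, port):
    """Expand Broadcast(signal, array.port) into n CONNECT operations."""
    for i in range(1, n + 1):
        connect(netlist, signal, f"{array_name}({i}).{port}")
    return netlist


wires = broadcast([], "x", "MULTIPLY_ADD_CELLS", 3, "a")
for wire in wires:
    print(wire)
```

A dynamically computed set of ports would replace the fixed `range` by whatever collection of port names the computation yields.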


3.4.  Regular  Interconnection  Structures  and  Related  Operations

When a set of processing elements have regular interconnections with their neighbors, it aids
comprehension and pattern matching if the "local" and "global" parts of the interconnections
are stated succinctly (as opposed to specifying the detailed interconnections).

While the components of an architecture are described by the set of interconnections between
the hardware modules it consists of, its functioning, or the computational details of an
algorithm tailored to it, involves stating how input data streams move through the system, get
operated upon, and ultimately emerge as output streams. We now give examples of these in
some standard settings.


3.4.1.  Linear  Interconnections

A pipelined computation in a linear interconnection of a set of cells can be expressed as

Pipeline(Array,  Direction,  BoundaryConditions,

Set_of_Output_Ports,  Set_of_Input_Ports_of_Adjacent_Cell)

where Direction is either left-to-right or right-to-left, the BoundaryConditions state the
input at the left or right extreme port and what is to be done at the corresponding output, and
the pair of sets Set_of_Output_Ports and Set_of_Input_Ports specify the set of
complementary port names that detail which ports of adjacent cells are interconnected.

As a specific example, we have

Pipeline(MULTIPLY_ADD_CELLS,  LeftToRight,  0,

MULTIPLY_ADD_CELL(i).c,  MULTIPLY_ADD_CELL(i+1).a)

or

Pipeline(MULTIPLY_ADD_CELLS,  LeftToRight,  0,

MULTIPLY_ADD_CELL.c,  Right(MULTIPLY_ADD_CELL).a)

where Right(MULTIPLY_ADD_CELL) indicates the cell to the right of the current cell in the
linear array.
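One way to read the Pipeline form is as a generator of connections between adjacent cells. The Python sketch below is only an interpretation: the argument list, the treatment of the boundary value as the input of the first cell, and the string names for ports are our assumptions, and what happens at the final output is not modelled.

```python
def pipeline(array, n, direction, boundary, out_port, in_port):
    """Expand a linear Pipeline description into a list of connections.

    The boundary value feeds the input port of the first cell in the
    direction of flow; each cell's output port feeds the input port of
    the next cell.
    """
    order = range(1, n + 1) if direction == "LeftToRight" else range(n, 0, -1)
    cells = [f"{array}({i})" for i in order]
    wires = [(boundary, f"{cells[0]}.{in_port}")] if cells else []
    wires += [(f"{a}.{out_port}", f"{b}.{in_port}")
              for a, b in zip(cells, cells[1:])]
    return wires


for wire in pipeline("MULTIPLY_ADD_CELLS", 3, "LeftToRight", 0, "c", "a"):
    print(wire)
```

Reversing the direction simply reverses the order in which the cells are chained.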

Such  constructs  can  be  generalized.  As  an  example,  we  next  consider  tree  interconnections. 


3.4.2.  Tree  Interconnections

As  an  example,  we  give  the  skeletal  specification  of  the  operations  and  workings  of  a
"Dictionary  machine"  that  has  the  main  computation  performed  by  its  leaf  processors.  Note 
that  the  broadcasting  process  may  itself  be  defined  in  terms  of  a  task  (in  Ada). 


task Dictionary is
   entry INSERT (k: in KEY; r: in RECORD);
   entry DELETE (k: in KEY);
   entry SEARCH (k: in KEY; r: out RECORD);
   entry UPDATE (k: in KEY; r: in RECORD);
   entry MIN_RECORD (r: out RECORD);


task body Dictionary is

   TREE: BinaryTree(DictionarySize, LeafProcessor,
                    InternalNodeProcessor);

   -- This processor tree implements the Dictionary.
   -- LeafProcessor and InternalProcessor are 2
   -- types of processors ("task types") that are
   -- used in instantiating the tree.

   FunctionPort, KeyPort, RecordPort: Port;

   -- FunctionPort represents the physical lines
   -- that activate the function invoked and the
   -- lines needed for the request/acknowledge protocol.
   -- KeyPort and RecordPort represent the physical
   -- lines associated with k and r.
   -- The association between the logical ports and physical ports
   -- is detailed below. The general form of this construct is
   -- REPRESENT (physical-port-name, function-name, parameter-name)
   -- which states that the "physical-port-name" represents
   -- the "parameter-name" associated with "function-name".
   -- This enables statement of time multiplexing of the lines.

   REPRESENT(KeyPort, INSERT, k);
   REPRESENT(KeyPort, MIN_RECORD, k);
   REPRESENT(RecordPort, INSERT, r);
   REPRESENT(RecordPort, SEARCH, r);
   REPRESENT(RecordPort, UPDATE, r);
   REPRESENT(RecordPort, MIN_RECORD, r);

   -- The interconnections to the global ports are described below.

   CONNECT(Root(TREE).ANSWER, RecordPort);
   CONNECT(Root(TREE).KeyPort, KeyPort);
   CONNECT(Root(TREE).FunctionPort, FunctionPort);

begin
   loop
      select
         accept SEARCH (k: in KEY; r: out RECORD) do
            Broadcast(k, r, Leafs(TREE).SEARCH);
            -- delay O(log(DictionarySize))
            -- this is done by making use
            -- of the internal node processors.
            -- The "Answer" from the root is
            -- connected to the global port
            -- corresponding to r.
         end SEARCH;
      end select;
   end loop;
end Dictionary;


task type LeafProcessor is
   entry INSERT (k: in KEY; r: in RECORD);
   entry SEARCH (k: in KEY);
   ...
end LeafProcessor;

task body LeafProcessor is
   LeafKey: KEY;          -- This is the local key.
   LeafRecord: RECORD;    -- This is the record in the
                          -- processor, or a code
                          -- indicating that there is no
                          -- record at this leaf.
begin
   loop
      select
         accept SEARCH (k: in KEY) ...
            ... DEFINED(LocalRecord) ...
            ... Father.ANSWER(LocalRecord);
         ...
      end select;
   end loop;
end LeafProcessor;


task type InternalNodeProcessor is
   entry DELETE (k: in KEY);
   ...
   entry CombinedAnswer (... out RECORD);
end InternalNodeProcessor;

task body InternalNodeProcessor is
begin
   loop
      select
         accept SEARCH (k: in KEY; ...) do
            Broadcast(k, Sons.SEARCH);
            delay 1;
            ...
         end SEARCH;
      end select;
   end loop;
end InternalNodeProcessor;


3.5.  Input  and  Output  of  Data  Streams

The manner in which the data for a computation is input and output is of great
importance in designing ... Succinct descriptions of such data ...
dual purpose in aiding simulations ...
level circuit simulations.

A:  array(1..N)  of  BITS;
INPUT(A);            --  Represents  an  array  of  bits  input  in  parallel
INPUT(SKEW(A,1));    --  Represents  a  rotated  wavefront  of  bits
                     --  where  A(1)  is  input  at  time  1,
                     --  A(2)  is  input  at  time  2,  ...
                     --  and  so  on.

Additional timing statements may similarly be incorporated. These forms can be expanded
in a natural manner to help describe the operation of algorithms tailored to specific
architectures, e.g. [10].
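The difference between the parallel and the skewed input forms can be made concrete by tabulating when each element enters the array. The Python function below is a hypothetical illustration of that reading of SKEW (element i delayed by skew*(i-1) time steps); it is not a definition taken from the notation itself.

```python
def input_schedule(n, skew=0):
    """Map each element index i (1-based) to the time at which it is input.

    skew=0 models INPUT(A): all elements arrive together at time 1.
    skew=1 models INPUT(SKEW(A,1)): A(i) arrives at time i.
    """
    return {i: 1 + skew * (i - 1) for i in range(1, n + 1)}


print(input_schedule(4))
print(input_schedule(4, skew=1))
```

Such a table is also a convenient starting point for driving a simulation of the array.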


3.6.  Distribution  of  Data.  A  Systolic  Stack  Implementation 

...  developed  a  general  technique  for  aiding  such  decisions


Figure  3-6:  A  Systolic  Stack  Implementation


task SystolicStack is
   entry INSERT (x: in Element);
   entry DELETE (x: out Element);

   -- Additional "semantic annotations" specify the behavior
   -- of these to be that associated with the abstract data
   -- type "Stack"; we omit these here for brevity.

end SystolicStack;


task body SystolicStack is

   record
      CellArray: array(1..N) of PushCell;
   end;

   -- Local Interconnections

   Connect(PushCell.SendLeft, Left(PushCell).INSERT);
   -- this is a Right To Left data transfer direction

   Connect(PushCell.GetFromLeft, Left(PushCell).DELETE);
   -- this is a Left to Right data transfer direction

   -- Boundary Conditions

   Connect(INSERT, CellArray(N).INSERT);
   Connect(DELETE, CellArray(N).DELETE);
   Connect(CellArray(1).GetFromLeft, UNDEFINED);
   Connect(CellArray(1).GetFromLeft, ...);

begin
   loop
      ...
         end INSERT;
         ...
         end DELETE;
      end select;
   end loop;
end SystolicStack;


task type PushCell is
   entry INSERT (x: in Element);
   entry DELETE (x: out Element);
   entry SendLeft (x: out Element);
   entry GetLeft (x: in Element);
end PushCell;


task body PushCell is

   CurrentElement: Element;

begin

   accept INSERT (x: in Element) do

      OutputToLeft(SendLeft(CurrentElement));
      -- send current contents to left neighbor.
      -- Both these operations can be
      -- done in 1 cycle using a
      -- 2-phase clocked flip-flop.

      CurrentElement := x;

   end INSERT;

   -- we omit implementations for the other ports.

end PushCell;
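The behaviour of the cell array can be simulated with a few lines of Python. This is a sketch under our own assumptions: the top of the stack is taken to be the rightmost cell, `None` stands for the UNDEFINED boundary value, and all cells are assumed to shift in the same cycle. The class and method names are hypothetical.

```python
class SystolicStack:
    """Linear array of cells; cells[-1] is the rightmost cell (stack top)."""

    def __init__(self, n):
        self.cells = [None] * n          # None plays the role of UNDEFINED

    def insert(self, x):
        """Every cell passes its contents to its left neighbour, then the
        rightmost cell latches the new element."""
        if self.cells[0] is not None:
            raise OverflowError("leftmost cell would lose its contents")
        self.cells = self.cells[1:] + [x]

    def delete(self):
        """Every cell takes its left neighbour's contents; the rightmost
        cell's old contents are returned."""
        top = self.cells[-1]
        if top is None:
            raise IndexError("stack is empty")
        self.cells = [None] + self.cells[:-1]
        return top


stack = SystolicStack(3)
stack.insert(1)
stack.insert(2)
print(stack.cells, stack.delete(), stack.delete(), stack.cells)
```

Because every cell only talks to its immediate neighbour, each insert or delete takes one cycle regardless of the number of cells.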


be  w“fom.»i  10  yioW  sjelol  0  ''fSll  deSTbTriolo?  mllroonnodlon 

The  presence  of  such  forms  and  also  reduces  greatly  the  amount  of 

M."^^o?h»trpSord,bo..erto.d>,.ve.he»men.u.te, 

The details of the transformation strategy, for a more general class of specifications, and a
mathematical basis for supporting such automated synthesis may be found in [26]. The
discussion there also elaborates on how the performance criteria, cost metrics and technology
constraints affect the synthesis strategies.


4.  Conclusions 

We have detailed in the preceding sections the structure of an automated transformation
system geared to aid in designing systems that consist of a mixture of software components and
special purpose (VLSI) hardware components. In particular, we have indicated the mapping of
various syntactic constructs in Ada into hardware structures, and some other high level
constructs into systolic implementations. It is intended that these transformation tools be
based on the theoretical framework developed in [26], and therefore produce designs that are
formally verifiable.

An additional contribution has been to delineate syntactic forms that aid succinct
descriptions of special purpose architectures and algorithms tailored to them. The design of
such constructs has been done to aid direct mapping into circuit layouts, and to reduce the
complexity of pattern matching involved in the transformation process. Such forms may, in
fact, be viewed as "macros", since they may be elaborated using the existing set of Ada
primitives. Unfortunately, however, the resulting expansions are sometimes quite clumsy and
obfuscating; on the other hand, a potential use of these expansions is in simulation of the
resulting hardware using commercially available compilers for Ada.

Finally, we have summarized some of the results of our preliminary empirical explorations in
using the transformation/synthesis methodology. The examples considered included various
classes of systolic algorithms and the hardware implementation of an Ada program fragment
using "path programmable logic" [20, 14]. Our preliminary results have been quite
encouraging, and have served to emphasize the importance of performance characteristics in
determining the global synthesis strategy. It has been estimated that the trade-off in using the
latter methodology for low level VLSI design results in about 10-20% increase in chip area
required (when compared with custom layouts), but results in a drastic reduction in the design
time (from a few months to a few days) [20].

Acknowledgements. We gratefully acknowledge the feedback received on various aspects of
this work from our colleagues in the "Ada-to-Silicon project", particularly Elliott Organick,
Tony Carter, Al Davis, Alan Hayes and Gary Lindstrom. Special thanks go to
S. Purushothaman for porting the transformation system to run on the Vax.
Appendix


1.  Hardware  Implementation  of  a  Part  of  the  Internet  Protocol:  A  Case  Study

In this appendix, we summarize the results of applying the methodology detailed above to the
transformation of a fragment of an Ada program specifying the Department of Defense Internet
Protocol [16] into a hardware implementation. The Internet Protocol (henceforth referred to as
IP) is a communication protocol designed to enable packets to be transferred ...
The function of the particular module that we consider here (called Read_Init_Parameters) is
to read in the initialization parameters from the Memory Unit ... an
acknowledgement to the caller when it is done. The procedure shown below (Figure 4-1) is a
general procedure that achieves this while admitting a great deal of flexibility in the sizes of
various parameters.

Generation  of  the  Circuit  for  Read_Init_Parameters

The Ada program shown above is transformed using the methodology outlined above. For
the most part, it corresponds to a direct application of the strategies outlined in 2. Some of the
salient features resulting from the optimizations are described below.

The case statement, which constitutes the major portion of the body of the first loop, is
very highly specialized in that it simply checks the index variable of the loop and, depending
on its value, chooses a variable that is loaded from a single register (octet_register). As a
result, this is implemented by using a multiplexor which is controlled by the loop variable.
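
The selection described above can be illustrated with a small simulation. This is a minimal sketch in Python rather than the report's Ada, and it assumes (as the CONNECT statements of the target code suggest) that the selection is realized as a decoder whose outputs drive the Load lines of the eight parameter registers. The names `decode_one_hot` and `load_selected` are hypothetical.

```python
def decode_one_hot(index, width=8):
    """Return `width` enable bits with only bit `index` set."""
    if not 0 <= index < width:
        raise ValueError("index out of range")
    return [1 if i == index else 0 for i in range(width)]


def load_selected(registers, loop_index, octet):
    """Return the register contents after loading `octet` into the selected register."""
    enables = decode_one_hot(loop_index, len(registers))
    return [octet if enable else old for old, enable in zip(registers, enables)]


# Eight parameter registers, one loaded per loop iteration (indices 0..7 here).
registers = [0] * 8
for loop_index, octet in enumerate([10, 11, 12, 13, 14, 15, 16, 17]):
    registers = load_selected(registers, loop_index, octet)
```

Because exactly one enable line is high in each iteration, no comparison logic per case alternative is needed; the loop counter itself selects the destination.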

Since the variable "number_of_tos_table_octets" is the product of two variables,
"local_net_type_of_service_row_size" and "number_of_local_net_types_of_service",
and is never used except in a final escape clause in the second loop, we use two nested loops and
do away with the multiplication altogether.
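
The equivalence behind this optimization can be sketched as follows. This is an illustrative Python fragment, not the report's Ada; the function names and the `read_octet` callback (standing in for a memory request) are assumptions made for the example.

```python
def fill_with_product(row_size, num_rows, read_octet):
    """Original form: a single loop whose bound requires a multiplier."""
    total = row_size * num_rows
    return [read_octet() for _ in range(total)]


def fill_with_nested_loops(row_size, num_rows, read_octet):
    """Optimized form: two counters and two equality tests, no multiplier."""
    table = []
    for _entry in range(num_rows):       # outer counter, compared with num_rows
        for _octet in range(row_size):   # inner counter, compared with row_size
            table.append(read_octet())
    return table


def make_source():
    """Return a callback that yields 0, 1, 2, ... on successive calls."""
    state = {"next": 0}

    def read_octet():
        value = state["next"]
        state["next"] += 1
        return value

    return read_octet
```

Both forms issue the same number of reads in the same order, so the product is never needed as a value, only as a termination condition that two comparators can supply.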

The final target code is shown below. A symbolic description of the circuit, obtained from it
by using the Assassin program [3] and laying out the data paths, is also shown here. This form
of the circuit can be directly transformed into a set of masks for fabrication, an instance of
which is also shown.


separate (Inm_Out_module.Inm_Out)

procedure Read_init_parameters(response : out out_response) is

   procedure Memory_request(
         chunk_of_address_formal : chunk_of_address_type;
         do_write_formal         : boolean;
         octet_formal            : out octet_type)
      renames Memory.Request;

   octet_register : octet;

begin

   -- Download the 8 individual initialization parameters.

   for index in 1 .. 8
   loop
      memory_request(
         request_type_formal     => receive_datum_octet,
         chunk_of_address_formal => don't_care_X_datum,
         octet_formal            => octet_register);

      case index is
         when 1 => ...                := octet_register;
         when 2 => ...                := octet_register;
         when 3 => inm_address_length := octet_register;
         when 4 => inm_time_out(0)    := octet_register;
         when 5 => ...                := octet_register;
         when 6 => ...                := octet_register;
         when 7 => local_net_type_of_service_table_row_size
                                      := octet_register;
         when 8 => number_of_local_net_types_of_service
                                      := octet_register;
      end case;
   end loop;

   number_of_tos_table_octets :=
      local_net_type_of_service_table_row_size *
      number_of_local_net_types_of_service;

   for index in 1 .. number_of_tos_table_octets
   loop
      memory_request(
         request_type_formal     => receive_datum_octet,
         chunk_of_address_formal => don't_care_X_datum,
         octet_formal            => tos_table(index));
   end loop;

end Read_init_parameters;


Figure 4-1: Source Program for Read_Init_Parameters
with TransformationGenerics, NewBoolean;

procedure RIPTarget is

   task RIPStart is
      entry ReqRIP;
   end RIPStart;

   task body RIPStart is

      package InmMaxPacketLow         is new Register(size => 8);
      package InmMaxPacketHigh        is new Register(size => 8);
      package InmAddressLength        is new Register(size => 8);
      package InmTimeOutLow           is new Register(size => 8);
      package InmTimeOutHigh          is new Register(size => 8);
      package InmAckType              is new Register(size => 8);
      package InmTOSTableRowSize      is new Register(size => 8);
      package NoOfLocNetTOS           is new Register(size => 8);
      package TOSSizeCounterPrelimReg is new Register(size => 8);
      package TOSEntryCounter         is new Register(size => 8);
      package EntryDone               is new EqComparator(size => 8);
      package TOSDone                 is new EqComparator(size => 8);
      package Loop1Decoder            is new EnDecoder(InputSize => 3);
      package TypeOfServiceTable      is new RAM(AddressSize => 8,
                                                 WordSize    => 4);
      package TOSAddressRegister      is new ClrIncRegister(size => 8);

      package ControlUnit is

         task RIPState1 is
            entry Move2;
         end RIPState1;

         task RIPState2 is
            entry Move3;
         end RIPState2;

         task RIPState3 is
            entry Move1;
            entry Move4;
         end RIPState3;

         task RIPState4 is
            entry Move5;
         end RIPState4;

         task RIPState5 is
            entry Move6;
         end RIPState5;

         task RIPState6 is
            entry Move7;
         end RIPState6;

         task RIPState7 is
            entry Move5;
            entry Move8;
         end RIPState7;

         task RIPState8 is
            entry Move5;
            entry MoveSTRT;
         end RIPState8;


         task body RIPState1 is
         begin
            accept Move2() do
               move(on(MemoryRequest.Ack), to(RIPState2));
            end Move2;
            hold(MemReq);
         end RIPState1;

         task body RIPState2 is
         begin
            accept Move3() do
               move(on(NIL), to(RIPState3));
            end Move3;
            reset(MemReq);
            hold(Loop1Decoder.Enable);
         end RIPState2;

         task body RIPState3 is
         begin
            select
               accept Move1() do
                  move(on(NOT(MemoryRequest.Ack) AND
                          NOT(DecoderCounter.Carry)),
                       to(RIPState1));
               end Move1;
            or
               accept Move4() do
                  move(on(NOT(MemoryRequest.Ack) AND DecoderCounter.Carry),
                       to(RIPState4));
               end Move4;
            end select;
            hold(TOSSizeCounterPrelimReg.Inc);
         end RIPState3;

         task body RIPState4 is
         begin
            accept Move5() do
               move(on(NIL), to(RIPState5));
            end Move5;
            hold(TOSSizeCounterPrelimReg.Clr);
            hold(TOSEntryCounter.Clr);
            hold(TOSAddressRegister.Clr);
         end RIPState4;

         task body RIPState5 is
         begin
            accept Move6() do
               move(on(MemoryRequest.Ack), to(RIPState6));
            end Move6;
            hold(MemReq);
         end RIPState5;

         task body RIPState6 is
         begin
            accept Move7() do
               move(on(NIL), to(RIPState7));
            end Move7;
            reset(MemReq);
            hold(TypeOfServiceTable.Write);
         end RIPState6;

         task body RIPState7 is
         begin
            accept Move8() do
               move(on(NIL), to(RIPState8));
            end Move8;
            hold(TOSSizeCounterPrelimReg.Inc);
            hold(TOSEntryCounter.Inc);
            hold(TOSAddressRegister.Inc);
         end RIPState7;

         task body RIPState8 is
         begin
            select
               accept Move5() do
                  move(on(NOT(TOSDone)), to(RIPState5));
               end Move5;
            or
               accept MoveSTRT() do
                  move(on(TOSDone), to(RIPStart));
               end MoveSTRT;
            end select;
            hold(TOSSizeCounterPrelimReg.Clr);
            hold(TOSEntryCounter.Inc);
         end RIPState8;

      end ControlUnit;



   begin -- body of task RIPStart
      accept ReqRIP do
         move(on(InmServer.Request), to(RIPState1));
      end ReqRIP;
      hold(TOSSizeCounterPrelimReg.Clr);
   end RIPStart;


begin -- Body of procedure RIPTarget, specification of
      -- interconnections

   CONNECT (MemoryRequest.Output(0..7), TypeOfServiceTable.Input(0..7));
   CONNECT (MemoryRequest.Output(0..7), InmMaxPacketLow.Data(0..7));
   CONNECT (MemoryRequest.Output(0..7), InmMaxPacketHigh.Data(0..7));
   CONNECT (MemoryRequest.Output(0..7), InmAddressLength.Data(0..7));
   CONNECT (MemoryRequest.Output(0..7), InmTimeOutLow.Data(0..7));
   CONNECT (MemoryRequest.Output(0..7), InmTimeOutHigh.Data(0..7));
   CONNECT (MemoryRequest.Output(0..7), InmAckType.Data(0..7));
   CONNECT (MemoryRequest.Output(0..7), InmTOSTableRowSize.Data(0..7));
   CONNECT (MemoryRequest.Output(0..7), NoOfLocNetTOS.Data(0..7));

   CONNECT (EntryDone.Input1(0..7), InmTOSTableRowSize.Data(0..7));
   CONNECT (EntryDone.Input2(0..7), TOSSizeCounterPrelimReg.Data(0..7));

   CONNECT (TOSDone.Input1(0..7), NoOfLocNetTOS.Data(0..7));
   CONNECT (TOSDone.Input2(0..7), TOSEntryCounter.Data(0..7));

   CONNECT (TypeOfServiceTable.Address(0..7), TOSAddressRegister);

   CONNECT (Loop1Decoder.Input(0..2),
            TOSSizeCounterPrelimReg.Data(0..2));

   CONNECT (Loop1Decoder.Output(0), InmMaxPacketLow.Load);
   CONNECT (Loop1Decoder.Output(1), InmMaxPacketHigh.Load);
   CONNECT (Loop1Decoder.Output(2), InmAddressLength.Load);
   CONNECT (Loop1Decoder.Output(3), InmTimeOutLow.Load);
   CONNECT (Loop1Decoder.Output(4), InmTimeOutHigh.Load);
   CONNECT (Loop1Decoder.Output(5), InmAckType.Load);
   CONNECT (Loop1Decoder.Output(6), InmTOSTableRowSize.Load);
   CONNECT (Loop1Decoder.Output(7), NoOfLocNetTOS.Load);

end RIPTarget;
ABSTRACT


This report explores the contention that a high-order language specification of
a machine (such as an Ada program) can be methodically transformed into a
hardware representation of that machine. One series of well-defined steps
through which such transformations can take place is presented in this initial
study.

The  general  method  consists  of  a  two-fold  strategy: 


1.  Transform the high-level specification into a network of
intercommunicating "state machine/data path pairs".

2.  Through  a  catalogue  method,  map  each  state  machine  /  data  path  pair 
into  a  circuit  realization. 


Four  representational  levels  are  utilized  in  the  transformation  process.  Each 
inter-level  transformation  is  discussed.  The  four  levels  are: 


1.  Ada  specification  of  the  algorithm. 

2.  Machine-description  specification  of  the  algorithm,  consisting  of  a 
control  part  and  a  data  part.  This  version  is  expressed  in  a 
stylized  dialect  of  Ada  developed  for  this  study. 

3.  Protocol-definition specification of the algorithm, obtained by
inserting  constructs  that  define  inter-program  unit  communication. 

4.  Storage/Logic Array (SLA) specification of the algorithm, which can
be mapped directly to, and is regarded as equivalent to, a circuit
representation.

The transformation strategy relies upon exploiting a one-to-one correspondence
between  Ada  instantiations  of  generic  packages  introduced  in  the  level  2 
representation  and  SLA  "modules",  which  are  composed  of  primitive  SLA  cells 
introduced  at  level  4. 


The  transformation  methodology  described  in  the  paper  has  been  demonstrated  for 
a  non-trivial  Ada  program  example. 


1.  Introduction

This  report  reviews  elementary  principles  applicable  for  methodically 
transforming  a  high-order  language  specification  of  a  machine,  such  as  an  Ada 
program,  into  a  hardware  representation  of  that  machine.  In  this  initial  study, 
we  discuss  one  series  of  well-defined  steps  through  which  such  transformations 
can take place.

Research on automating Ada-to-Silicon transformations is currently underway
at the University of Utah [9]. In this report, which does not attempt to
document the specifics of the mainstream of that research, we outline a series
of mappings for transforming individual Ada program units to equivalent
integrated circuits. Our emphasis is on the feasibility of these
transformations; we are not concerned with finding a series of optimal
transformation steps. Our purpose is to:

1.  Demonstrate  one  (relatively  straightforward)  approach  by  which  an  Ada 
program  can  be  mapped  into  a  specification  of  an  integrated  circuit 
(IC)  through  adherence  to  rule-based  techniques. 

2.  Examine the pros and cons inherent in the most straightforward,
unoptimized approach.

The method presented follows the general transformation strategy suggested
earlier [6]. The essence of this strategy is to represent each Ada program unit
as a synchronous stored state machine part and a data path part. Circuits
derived by following this approach have the general form pictured in Figure 1-1.
The pairing of a state machine and a data path (i.e., an environment) is
referred to as an "engine". The hardware realization of an entire Ada program,
or of any subset of program units of that program, is actually a network of
asynchronously intercommunicating engines, each having the form outlined in
Figure 1-1. For the convenience of this report, individual Ada tasks are
considered to be program units.

A transformation methodology is just beginning to be explored [11]. There is a
need to develop a well-defined set of rules through which such transformations
can eventually become a mechanical process. Some guidelines that distinguish a
set of rules as having the potential for eventual automation have been suggested
[10].


[Figure: the State Machine Part sends control signals to the Local Environment
Part and receives feedback from it; the Local Environment Part receives the
Input and produces the Output.]

Figure 1-1: An Engine and Its Two Principal Components


The  transformations  presented  here  are  considered  to  be  extensions  of  those 
originally  outlined  in  the  following  sense: 


1.  Not  only  is  the  high-level  specification  of  a  program  unit  expressed 
in  Ada;  intermediate  levels  of  representation  are  also  expressed  in 
Ada.  "Machine-description"  and  "Protocol-definition"  styles  of  Ada 
programming  are  proposed  to  express  intermediate  transformation 
steps,  permitting  the  algorithmic  behavior  to  be  checked  through  Ada 
program execution at all intermediate levels as well as the top
level.

2.  NMOS Storage Logic Array (SLA) technology [15] [14] is chosen for the
low-level realization of the machine. (More practical versions of
SLAs, called PPLs, have been developed to serve as a target for this
transformation process [9].) SLA "modules" give us a set of building
blocks  that  fit  the  specific  needs  of  this  method.  Utilization  of 
other  semi-custom  integrated  circuit  components  offers  an  opportunity 
for  enrichment  of  this  methodology  into  the  VLSI  range. 


A high-order language Ada program is transformed in three steps to reach the
level  of  representation  from  which  integrated  circuits  may  be  produced  directly. 
In  this  report,  the  four  levels,  counting  the  starting  level,  are  called 
"stages".  These  stages  are: 


1.  High-level  Ada  program 

2.  Machine-description  Ada  program 

3.  Protocol-definition  Ada  program 


4.  NMOS SLA program or equivalent
Characteristics of these stages and rules that guide the transformations
between them are presented in succeeding sections. A case study that was
performed following this method on a non-trivial Ada program is presented
elsewhere [6].

[We again stress that circuit optimization (space or speed) is not a goal
addressed  in  this  paper.  Thus,  in  situations  where  performance  or  circuit  area 
or  both  are  critical,  the  approach  presented  is  unlikely  to  yield  circuits  with 
characteristics  that  are  competitive  with  those  produced  by  more  custom  methods, 
especially  for  many  important,  but  special  algorithms,  e.g.,  those  that  lead  to 
compact  systolic  arrays.] 

2.  Stage  1:  High-Level  Ada  Program 

The  machines  specified  and  realized  by  our  transformation  process  are  viewed 
as ensembles of interacting state machine/environment pairs (engines). The
programming  language  Ada  is  well-suited  for  specifying  such  pairs.  Thus,  a 
strong  correlation  exists  between  data  abstractions  in  Ada  and  data  abstractions 
in  certain  views  of  integrated  circuits;  indeed  we  exploit  this  correlation. 

An Ada program is composed of one or more program units [5] [2]. A program
begins execution as a single thread of control in the main subprogram, but can
initiate tasks, each of which has associated with it a separate thread of
control. A program unit in this model is analogous to a machine that is
initiated via a single "Go" button, but which is capable of delegating work
among potentially concurrent sub-machines. In Ada, such sub-machines take the
form of tasks. Ada also offers flexibility and control in specifying the
communication between program units, i.e., in specifying the kind of interaction
between units. Data abstractions represented as Ada packages, another form of
program unit, are also transformable into individual engines whose operators
either transform given instances of a data type or own and operate on individual
instances. Shifting such an engine from idle to a particular active state
corresponds, at a higher level of abstraction, to the activation of an Ada
package operation.

Information needed to represent an engine can be extracted from an Ada
program unit for use in representing the local environment (data path) and the
state machine (controller). This information is drawn both from the
specification part and from the body part of the program unit being mapped to
the next stage.

Stage  2  representation  elaborates  intra-program  unit  constructs  while  Stage  3 
elaborates  inter-program  unit  communication  constructs.  The  language  for  Stage 
2  is  a  stylized  but  legal  form  of  Ada. 

3.  Stage  2:  Machine-description-level  Ada  program 

3.1.  The  Role  of  Stage  2 

A  Stage  2  program  achieves  two  objectives: 

1.  Infers  a  collection  of  needed  hardware  modules  from  the  declaration 
part  of  the  program  unit  and  identifies  the  needed  modules  through 
instantiation  of  generic  packages. 

2.  Transforms  infix  expressions  represented  in  the  Stage  1  form  into 
prefix  form. 

The  distinction  between  the  control  flow  and  data  flow  of  a  program  is  sharpened 
by  the  transformation  from  Stage  1  to  Stage  2.  Thus,  in  its  Stage  2  form,  the 
program  takes  the  form  of  a  state  machine  and  the  data  path  it  controls.  The 
declarative  part  of  the  Stage  2  form  represents  a  collection  of  hardware  modules 
(a  "data  path")  inferred  from  the  declarative  part  of  the  Stage  1  form.  The 
body  part  of  the  Stage  2  form  represents  a  state  machine  whose  structure  is 
inferred  from  both  the  declarative  and  body  parts  of  the  Stage  1  form.  The 
Stage  2  language  style  has  two  distinguishing  features: 

-  extensive  use  of  generic  building  blocks 
-  use of the "engine extension" style of representing states and state
transitions


The  terms  "building  block"  and  "module"  have  specific  meanings  below.  A 
"building  block"  refers  to  a  generic  package  instance  introduced  in  Stage  2  to 
model  a  particular  component  of  the  data  path.  A  "module"  refers  to  a 
collection  of  SLA  cells  from  which  the  full  circuit  will  be  constructed.  Every 
generic  package  instance  identified  in  the  Stage  2  representation  maps  to  a 
corresponding Stage 4 SLA module.


3.2.  Stage  2  Examples 

Figure 3-1 is an example of a generic package declaration for a building
block  representing  a  counter.  An  instantiation  of  this  package  (e.g.,  "package 
C  is  new  Counter")  corresponds  to  the  module's  "black  box"  representation  (see 
Figure  3-2).  The  SLA  program  that  corresponds  to  Figure  3-2  is  presented  in 
Figure  3-3. 

generic
   lo_value: integer;
   hi_value: integer;
   -- allows one to instantiate
   -- counters of various sizes
package Counter is
   -- function:
   --    a counter with load, lookup,
   --    increment, and decrement operations
   procedure Load(
      load_value: in integer );
   procedure Increment;
      -- Increment by 1 is implied.
   procedure Decrement;
      -- Decrement by 1 is implied.
   function Lookup return integer;
      -- Returns the current value.
end Counter;

Figure  3-1:  Counter  Building  Block  Package  Specification 

Figure 3-2: "Black Box" Representation of a Counter Module

With a few exceptions (to be discussed below) all variables and operators in
the Stage 1 program unit are transformed into instantiations of generic
packages. The Stage 2 code is then restricted to describing actions through the
use of these instantiated packages. Stage 1 to Stage 2 transformations result
in code that is composed primarily of function and procedure applications. For
example, a line of code such as

   A := B + C;

is transformed into

   A.Write(Add.Go(B.Read, C.Read));

where A, B, C, and Add are previously instantiated packages. Thus, if the Stage
1 code includes the object declaration

   A, B, C: integer;

the corresponding Stage 2 form would exhibit the instantiations

   package A is new Register(word_length => integer);
   package B is new Register(word_length => integer);
   package C is new Register(word_length => integer);

Figure 3-3: SLA Program for Counter Module Using the SCLED Notation

Furthermore, encountering "+" while parsing Stage 1 code would lead to the
inclusion of

   package Add is new Adder;

in the corresponding declarative part of the Stage 2 code. Hence, the code
presented in this example would eventually map into a hardware structure
abstractly presented in Figure 3-4.
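
The behavior of the transformed statement can be mimicked with a small simulation. The following is a minimal sketch in Python, not the report's stylized Ada; the classes `Register` and `Adder` are hypothetical stand-ins for the building blocks of the same names, and their internals are assumptions made only for illustration.

```python
class Register:
    """Stand-in for a Register building block: holds one value."""

    def __init__(self):
        self._value = 0

    def read(self):
        return self._value

    def write(self, value):
        self._value = value


class Adder:
    """Stand-in for an Adder building block: purely combinational."""

    @staticmethod
    def go(left, right):
        return left + right


# One instance per declared variable, plus one per operator encountered.
A, B, C = Register(), Register(), Register()
Add = Adder()

B.write(2)
C.write(3)
A.write(Add.go(B.read(), C.read()))   # prefix form of the infix statement A := B + C
```

Every operation in the prefix form names the instance that performs it, which is what lets each call correspond to a distinct piece of hardware.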


Figure 3-4: Hardware Realization of "A := B + C;"

The design of the building block set and the design of the SLA module set
must be coordinated. As a possible means of enforcing the design discipline, a
Stage 2 programmer is provided with one or more packages that specify the set of
generic packages available. The programmer can thereby be restricted to
expressing algorithms with instantiations and use of the pre-defined generic
packages.

3.3.  The  "Engine"  Extension  to  Ada 

The  body  part  of  a  Stage  2  program  is  sub-divided  into  states  denoted  by 
labels.  To  represent  the  mutually  independent  actions  that  can  occur  in  the 
same  state  of  a  state  machine  in  standard  Ada,  one  could  use  the  "verbose  form" 
that  declares  (and  then  initiates)  a  set  of  dynamically  created  tasks.  A  more 
succinct  equivalent  is  possible  if  we  were  to  include  an  "engine  extension"  for 
Ada  to  specify  a  similar  objective.  Used  at  Stage  2,  the  engine  extension 
allows  one  to  specify  a  sequence  of  Ada  statements  that  can  be  translated  into 
concurrent  actions. 

An engine clause has the structure illustrated in Figure 3-5.

engine Example is
begin
   <<State_Start>>
      -- initial actions
      -- executed in parallel
   <<State_1>>
      -- actions to be
      -- executed in parallel
   <<State_2>>
      -- another set of actions which
      -- can be executed in parallel
   <<State_Stop>>
      -- final state
      null;
end Example;

Figure 3-5: Structure of an Engine Clause for Representing the Transition Graph
of a State Machine

Within the scope of an engine clause, the sequence of statements bounded by two
state labels, e.g., <<State_1>> and <<State_2>> above, are actions that can occur
in parallel. Execution of a "goto" statement within such a (labeled) sequence
terminates the actions within that state (i.e., triggers a state transition).
(To enhance readability, we follow the convention that the first node of every
engine clause be labeled "State_Start" and the final node be labeled
"State_Stop".)


Nesting of engine clauses follows Ada scoping rules. An engine may be
declared local to another engine just as one procedure can be declared local to
another procedure. Thus a local "sub-engine" may be called from its containing
"main-engine". The effect of such a call is to transfer control to the label
State_Start of the subengine at the time the subengine is called and to return
control to the main engine when the subengine completes.
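
The state-and-goto structure of an engine clause can be imitated by a small interpreter. This is a rough Python sketch, not the engine extension itself: the function `run_engine`, the `states` table and the example actions are all hypothetical. Python runs the actions of a state one after another, so the sketch only models actions that are mutually independent, for which the order is immaterial.

```python
def run_engine(states, start="State_Start", stop="State_Stop", max_steps=100):
    """Run a state machine given as {label: (actions, choose_next)}.

    `actions` is a list of mutually independent callables; `choose_next`
    plays the role of the "goto" that ends the state.
    """
    trace = []
    current = start
    for _ in range(max_steps):
        trace.append(current)
        if current == stop:
            return trace
        actions, choose_next = states[current]
        for action in actions:
            action()
        current = choose_next()
    raise RuntimeError("engine did not reach the stop state")


env = {"count": 0, "flag": False}


def increment():
    env["count"] += 1


def set_flag():
    env["flag"] = True


states = {
    "State_Start": ([lambda: env.update(count=0)], lambda: "State_1"),
    "State_1": ([increment, set_flag],
                lambda: "State_1" if env["count"] < 3 else "State_2"),
    "State_2": ([], lambda: "State_Stop"),
}
trace = run_engine(states)
```

A sub-engine call could be modeled in the same way by letting an action invoke `run_engine` on a nested table and return when that table reaches its own stop state.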

Note that this technique does not imply a relationship between state
transitions and units of time. Although the particular SLA implementation
chosen for Stage 4 in this work is synchronous, a syntax comparable to the
engine extension has been mapped to asynchronous implementations [»<]. An
algorithm used to determine the operations for which one can specify parallel
execution, i.e., multiple actions within the same state, is presented in Section
5.

3.4.  Building Blocks and Modules

For the purpose of this report, the following building blocks and modules
have been designed [6]: Equals, Less_eq, Bool_eq, Counter, Loop_Counter,
Register, Boolean_Register, Memory, and Two_D_Memory.

Building  blocks  and  modules  generally  have  parameters  for  specifying  word 
lengths. Such specifications are provided by the Stage 2 programmer as part of
an  interactive  design  process.  Thus,  most  generic  package  declarations  contain 
the  formal  generic  parameter 

type  word_length  is  range  <>; 

3.5.  Three  Intra-program  Unit  Communications  Protocols 

Three different intra-program unit protocols are defined, corresponding to
the "function", "procedure", and "procedurE" Stage 2 subprogram declarations.
These Stage 2 declarations convey assumptions about the number of states
required for an operation to "complete its job". Different protocols may be
utilized for invoking various operations within an implemented package. The
corresponding SLA implementation is invoked with whichever protocol is
appropriate. (Protocols for communication between circuits representing separate
Ada program units are discussed in Section 6.)

Operations  are  divided  into  two  classes:  those  that  return  a  value  (e.g.,  a 
Read  operation)  and  those  that  do  not  (e.g.,  a  Write  operation).  Hardware 
implementation  of  the  former  requires  that  the  module  includes  storage  elements 
to hold the value of the output parameter (or function result). The protocols
presented  below  ensure  that  such  storage  elements  are  sampled  only  after  the 
correct  values  are  loaded.  In  operations  that  do  not  return  a  value,  the 
protocols  ensure  that  the  module  completes  its  job  (for  example,  modification  of 
a  global  value)  before  a  potentially  conflicting  operation  can  be  initiated. 
The distinguishing characteristics of operations adhering to each of the
three protocols are as follows:

•  "function" protocol: The operation completes in the same state in
   which a request for the operation reaches the containing module. Two
   cases are implementable:

   1.  The function result is always available.

   2.  The request is received in phase Phi-1 of a given clock cycle
       and the result is available in phase Phi-2 of the same
       clock cycle.

   A function operation (such as the Lookup operation on a Counter
   module) does not need to issue an acknowledge to its requestor that it
   has performed its duty, because it can be assumed that the correct
   result will be available in a known state.

•  "procedure" protocol: The operation completes in the state immediately
   following the one in which the request reaches the module. As in the
   function protocol, it is not necessary for the procedure operation
   (such as the increment operation on a Counter module) to inform the
   requestor that the desired action has been performed.

•  "procedurE" protocol: For a procedurE operation, it cannot be assumed that
   the job will be completed in the same state in which the request is
   received, or in the state that follows. Unlike the two previous
   protocols, it is necessary for the containing module to inform the
   requestor that the desired action has been completed.

   The scenario is as follows: a requestor initiates a procedurE
   operation by issuing a "Go" signal; the procedurE in turn signals its
   caller, upon successful completion, with an "I'm done" signal. We
   call this convention the "Go/I'm done" protocol. Its use allows the
   introduction of arbitrary delays in the state transitions for clocked
   schemes that exhibit a single thread of control. The protocol, which
   is enforced by construction, is implemented as follows:

   *  The requesting engine R sends a "Go" signal that invokes the type
      procedurE operation P of a containing module M and then enters a
      state where R waits for M to send an "I'm done" signal.

   *  The initial state of M is a wait state for a "Go" signal. A Go
      for P causes the states of the operation P to commence (a transition
      out of the wait state). When operation P completes, M emits an
      "I'm done" signal before returning to its initial state.

   The protocol permits representation of a single thread of control that
   traverses from the requesting engine R to the host module M of the
   operation P and back again. The sequence of state
   transitions for every procedurE operation is local to one, and only
   one, engine. Hence, there is no possibility for contention. It is
   this fact that allows us to use the simple "Go/I'm done" protocol
   (instead of a somewhat more complex Request/Acknowledge) for
   intra-engine communication. The Read and Write operations on the Memory
   module are examples of the procedurE protocol.
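
The "Go/I'm done" handshake can be illustrated with a small clocked simulation. This is a sketch in Python under stated assumptions, not the SLA implementation: the class `Module`, its `latency` parameter (the number of states the operation occupies) and the function `request` are hypothetical names introduced for the example.

```python
class Module:
    """Host module M: waits for Go, works for `latency` states, then signals done."""

    def __init__(self, latency):
        if latency < 1:
            raise ValueError("latency must be at least one state")
        self.latency = latency
        self.remaining = 0
        self.state = "WAIT_GO"
        self.done = False

    def step(self, go):
        """Advance M by one state transition."""
        self.done = False
        if self.state == "WAIT_GO":
            if go:
                self.remaining = self.latency
                self.state = "BUSY"
        elif self.state == "BUSY":
            self.remaining -= 1
            if self.remaining == 0:
                self.done = True          # the "I'm done" signal
                self.state = "WAIT_GO"    # back to the initial wait state


def request(module, max_cycles=50):
    """Requesting engine R: issue Go once, then wait for "I'm done"."""
    module.step(go=True)
    cycles = 1
    while not module.done:
        if cycles >= max_cycles:
            raise RuntimeError("no I'm-done signal received")
        module.step(go=False)
        cycles += 1
    return cycles
```

Since R does nothing but wait between Go and "I'm done", only one of the two machines is ever active, which is the single thread of control the protocol relies on.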

4.  Stage  1  to  Stage  2  Transformations 

4.1.  Transforming Simple Expressions

Simple  expressions  are  transformed  in  a  straightforward  way.  Registers 
replace  variables,  comparators  replace  relational  operators,  adders  replace  plus 
signs,  etc.  Such  transformations  are  syntax  driven. 
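
A syntax-driven rewrite of this kind can be sketched in a few lines. The example below is a Python illustration, not the transformation system used in this work: it borrows Python's own expression parser (the standard `ast` module) in place of an Ada parser, and the operator-to-block table and function names are assumptions. For brevity it reuses one block name per operator, whereas the method described here would allocate a separate instance for each occurrence.

```python
import ast

# Hypothetical mapping from operator node types to building block names.
OPERATOR_BLOCKS = {ast.Add: "Add", ast.Sub: "Sub"}


def to_stage2(expr):
    """Rewrite an infix expression over variables into prefix-call form."""

    def walk(node):
        if isinstance(node, ast.Name):
            return f"{node.id}.Read"
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATOR_BLOCKS:
            block = OPERATOR_BLOCKS[type(node.op)]
            return f"{block}.Go({walk(node.left)}, {walk(node.right)})"
        raise ValueError("unsupported construct")

    return walk(ast.parse(expr, mode="eval").body)


def assign_to_stage2(target, expr):
    """Rewrite 'target := expr' into a Write call on the target register."""
    return f"{target}.Write({to_stage2(expr)});"
```

For instance, `assign_to_stage2("A", "B + C")` yields the string `A.Write(Add.Go(B.Read, C.Read));`.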

This style of transformation leads to the allocation of possibly redundant
modules. Clearly, circuits produced by this method tend to be wasteful of "real
estate". However, timing and communications are simplified in activating
individual modules, since each Stage 2 call on a subprogram operation of a
generic instantiation then corresponds to a unique control line at the hardware
level. Some simple optimizations are possible within this framework; for
example, use of counters where adders are not needed, and use of shift logic,
where suitable, for multiplication or division.
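
The syntax-driven allocation can be illustrated with a small sketch. It is
written in Python purely for illustration; the function name allocate_modules
and the module names used as keys are assumptions of the sketch and not part of
the method's tooling. Each distinct variable yields one Register and each
operator occurrence yields its own module, which reproduces the redundancy
noted above.

```python
import ast
from collections import Counter

def allocate_modules(expression: str) -> Counter:
    """Count the modules a purely syntax-driven pass would allocate.

    One Register per distinct variable, one Adder per '+' occurrence,
    and one Comparator per relational operator occurrence.
    """
    modules = Counter()
    variables = set()
    for node in ast.walk(ast.parse(expression, mode="eval")):
        if isinstance(node, ast.Name):
            variables.add(node.id)
        elif isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            modules["Adder"] += 1
        elif isinstance(node, ast.Compare):
            modules["Comparator"] += len(node.ops)
    modules["Register"] = len(variables)
    return modules

# Two '+' signs give two Adders, even though one adder could be shared.
print(allocate_modules("a + b < c + a"))
```

A sharing optimization would have to weigh the saved adder against the extra
control and communication lines it introduces.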


4.2.  Transforming  Control  Statements 

The interpretation of control statements (e.g., loop, case, if, subprogram
calls and task entry calls) leads to control flow changes. We discuss the
required transformations for such constructs in this subsection on a case by
case basis. In general, these transformations mimic well-understood strategies
used by compilers [1].

PROCEDURES, FUNCTIONS, AND TASKS  The initial action to be performed in the
body parts of procedure, function, and task entries with in parameters is the
loading of the actual parameter values into the Registers that implement the
corresponding formal parameters. Statements directing such actions must be
inserted into the Stage 2 program.


Out parameters also require instantiation of Register packages so their
values can be loaded into these Registers as if they were local parameters and
hence mimic the "copy-restore" parameter passing mechanism demanded (for the
normal case) by Ada semantics. A similar treatment is required so that function
values can be properly returned.


Building blocks that represent formal parameters of program units are derived
in Stage 2. For example, if procedure P and function F are specified as:

    procedure P(
        xx: integer;
        yy: integer);
    function F(
        zz: integer)
        return real;

then four generic packages are instantiated:

    package xx is new Register(word_length => in integer);
    package yy is new Register(word_length => in integer);
    -- For P.
    package zz is new Register(word_length => in integer);
    package f_result is new Register(word_length => real);
    -- For F.


IF-STATEMENTS  In the simplest case, if-statements are manifested in Stage 2
as structures of the form:

    <<State_for_if>> if condition then
                         goto State_X;
                     else
                         goto State_Y;
                     end if;

Missing but implicit else clauses are explicitly inserted. For example:

    else
        goto State_<the_state_where_the_2_branches_join>;

It is certainly possible, and in many cases advisable, to include actions in
the branches before the goto statement, thereby reducing the total number of
states specified in the machine description. For example,

    if mem_value = 0 then
        pointer := p_find;
        exit;
    end if;

is transformed into

    declare
        equals_result: boolean := false;  -- Initialized to
        ...                               -- false.
    begin
        <<State_4>> Equals.Test(
                        Mem_value.Lookup(), 0, equals_result);
                    goto State_5;
        <<State_5>> if equals_result then
                        Pointer.Write(P_find.Lookup());
                        goto State_6;     -- Goes to exit.
                    else                  -- Else is now explicit.
                        goto State_7;
                    end if;


Notice the use of the boolean variable "equals_result" to represent the value of
the condition. The rule followed is that the use of identifiers with "_result"
as a suffix specifies Stage 4 routing to a storage element that is located
within the module specified by the prefix (e.g., Equals). The storage element
is loaded with the result of the operation. Every relational operator building
block has such a "buddy" boolean variable. Out parameters in procedures and
procedurEs, such as the value returned from a memory Read procedurE, are also
treated this way.
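
The two-state form of the transformed if-statement can be traced with a short
sketch. It is given in Python for illustration only; the function name
run_if_states, the trace list, and the modeling of the Equals module as a
plain comparison are assumptions of the sketch.

```python
def run_if_states(mem_value, p_find):
    """Step through State_4 and State_5 of the transformed if-statement."""
    equals_result = False      # "buddy" boolean of the Equals module
    pointer = None             # contents of the Pointer register
    state = "State_4"
    trace = []
    while state in ("State_4", "State_5"):
        trace.append(state)
        if state == "State_4":
            equals_result = (mem_value == 0)   # Equals.Test(...)
            state = "State_5"
        elif equals_result:                    # State_5, then-branch
            pointer = p_find                   # Pointer.Write(P_find.Lookup())
            state = "State_6"                  # goes to exit
        else:                                  # State_5, explicit else
            state = "State_7"
    return state, pointer, trace

print(run_if_states(0, 42))   # ('State_6', 42, ['State_4', 'State_5'])
print(run_if_states(3, 42))   # ('State_7', None, ['State_4', 'State_5'])
```

The comparison and the branch occupy separate states because the branch
consumes a result that the Equals module produces in the preceding state.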


BLOCKS  A  block  is  treated  as  a  parameterless  procedure. 


FOR-LOOPS  A generic Loop_Counter package that computes and holds the loop
parameter value is instantiated for each Stage 1 for-loop. This package also
stores the value of the upper limit of the discrete range. In case the upper
bound is a previously declared variable, e.g., Lim, a module that stores Lim's
value already exists, so the extra storage element is redundant. This
redundancy is accepted because, at the hardware level, the simplicity of
communication and saving of extra communications lines appear to outweigh the
use of extra storage space. Figure 4-1 shows the Stage 1 to Stage 2
transformation paradigm used for for-loops.


STAGE 1

    for parameter in A..B
    loop
        Statement_1;
        Statement_2;
        ...
        Statement_N;
    end loop;

STAGE 2

    -- Declaration part
    package Parameter is new Loop_Counter;    -- Instantiation.

    -- Body part
    <<State_X>>   Parameter.Load(A, B);       -- Load loop values.
                                              -- A is initial value.
                                              -- B is upper limit.
    <<State_Y>>   if Parameter.Test() then    -- Test the parameter
                                              -- versus upper bound.
                      goto State_Y+1;         -- Go to the sequence
                                              -- of statements.
                  else
                      goto State_Z+1;         -- Exit from loop.
                  end if;
    <<State_Y+1>> Statement_1;
    <<State_Y+2>> Statement_2;
                  ...
    <<State_Y+N>> Statement_N;
    <<State_Z>>   Parameter.Increment();
                  goto State_Y;               -- Go back to the test.
    <<State_Z+1>>                             -- Continue with the
                                              -- rest of the program.


Figure 4-1: A Paradigm For-Loop Transformation

WHILE-LOOPS  While-loop transformations require the instantiation of as many
building block packages as required to evaluate the while-loop condition. The
Stage 2 expression of a while-loop whose condition is a simple equality test is
modeled in Figure 4-2.

    <<State_Y>>   Equals.Test(
                      first_operand, second_operand, equals_result);
                  goto State_Y+1;
    <<State_Y+1>> if equals_result then
                      goto State_Y+2;
                  else
                      goto State_Z+1;         -- Exit the loop.
                  end if;
    <<State_Y+2>> Statement_1;                -- Begin loop body.
                  ...
    <<State_Y+N>> Statement_N;                -- End loop body.
    <<State_Z>>   goto State_Y;
    <<State_Z+1>>                             -- ...rest of program


Figure 4-2: Stage 2 Representation of a While-Loop
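
The for-loop paradigm of Figure 4-1 can be exercised with the following
sketch; the while-loop of Figure 4-2 follows the same pattern with Equals.Test
in place of Parameter.Test. The sketch is in Python for illustration only. The
class LoopCounter is a hypothetical stand-in for the generic Loop_Counter
package, the meaning assumed for Test (parameter not beyond the upper limit) is
our reading of the figure, and the loop body is collapsed into a single state.

```python
class LoopCounter:
    """Hypothetical stand-in for the generic Loop_Counter package."""
    def load(self, first, last):
        self.value, self.limit = first, last
    def test(self):
        return self.value <= self.limit   # assumed meaning of Test
    def increment(self):
        self.value += 1

def run_for_loop(a, b, body):
    """Walk the states of Figure 4-1 for 'for parameter in a..b'."""
    parameter = LoopCounter()
    state = "X"
    while state != "Z+1":
        if state == "X":                  # <<State_X>>
            parameter.load(a, b)
            state = "Y"
        elif state == "Y":                # <<State_Y>>
            state = "Y+1" if parameter.test() else "Z+1"
        elif state == "Y+1":              # loop body, one state here
            body(parameter.value)
            state = "Z"
        elif state == "Z":                # <<State_Z>>
            parameter.increment()
            state = "Y"
    return parameter.value

visited = []
run_for_loop(2, 5, visited.append)
print(visited)   # [2, 3, 4, 5]
```

An empty range such as 3..2 fails the first test in State_Y and leaves the
loop without executing the body.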


5.  Thoughts  towards  a  compiler 

The method just presented informally emulates a multi-pass compiler that
accepts as input a Stage 1 Ada program (i.e., a "normal" program confined only
by restrictions we may choose to impose on the use of Ada) and produces a Stage
2 program, which is also legal, though "stylized", Ada code. This method is
"compiler-like" in the sense that it is syntax driven and in that the
transformations are viewed as production rules.

The Stage 1 to Stage 2 transformation involves several passes over a program
unit. Backtracking within a given pass is sometimes necessary. For instance, a
pass may begin by scanning the program unit and declaring the instantiation of
all generic package objects that can be determined at that time, and may end
with the declaration of more package objects that have been determined to be
necessary while scanning the code. The passes can be organized as follows:

Pass 1 - Transforms the declaration part of the program unit and the
simple statements. Declares and instantiates packages that correspond
to formal parameters and inserts code to write the actual parameter
values into these packages.

Pass 2/Part A - Transforms compound statements, that is, loops, if
statements, accept statements and blocks. (Simple statements
exposed in this step are also transformed.) Records situations that
require backtracking. Also records situations that require new
packages to be instantiated.

Pass 2/Part B - Backtracks and replaces "temporary" state markers with
appropriate state numbers.

Pass 3 - Instantiates new packages whose need has been previously
recorded. Transforms expressions that involve relational operators
and expressions that similarly involve an increase in the number of
states.
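
The replacement of temporary state markers in Pass 2/Part B resembles the
label resolution step of an assembler. The following Python sketch illustrates
the idea under simplifying assumptions: the function name resolve_markers, the
marker names, and the (label, targets) representation are all hypothetical,
and states are numbered simply in program order.

```python
def resolve_markers(states):
    """Replace temporary state markers with final state numbers.

    `states` is a list of (label, goto_targets) pairs in program order;
    labels and targets are arbitrary temporary names.
    """
    number = {label: index for index, (label, _) in enumerate(states)}
    return [(number[label], [number[t] for t in targets])
            for label, targets in states]

program = [
    ("loop_test", ["loop_body", "after_loop"]),   # forward references
    ("loop_body", ["loop_step"]),
    ("loop_step", ["loop_test"]),                 # backward reference
    ("after_loop", []),
]
print(resolve_markers(program))
# [(0, [1, 3]), (1, [2]), (2, [0]), (3, [])]
```

Forward references such as the loop exit are the reason a marker cannot be
given its number at the moment it is first emitted.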


5.1.  Determining  concurrency  within  a  state 

Determining which actions may take place in parallel is an important part of
the methodology. Reasoning can be applied to specific cases based on the
function, procedure, and procedurE specifications. However, a general rule is
desirable. The following principles (constraints) are adhered to:

1. At the Stage 2 level no two operations of a given package instance
may be called within a given state. This applies both to multiple
calls on a single subprogram contained in a generic package instance
and to single calls on different subprograms of the same package.
Thus, the calls

    Point.Load;
    Point.Test;

must be invoked in separate states, whereas

    Point.Load;
    Slot.Test;

or

    Point.Load;
    Slot.Load;

may be initiated concurrently.

2. After receiving an appropriate "Go" signal, a module M (executing a
type procedurE operation) will not recognize another "Go" signal sent
from a module N until after M raises the matching "I'm done" signal.
If a module N were to send such a signal, its "Go" signal would be
ignored and the action that N requests of M would never take place.
Furthermore, N runs the risk of mistakenly viewing the "I'm done"
signal M sends upon completion of the previous operation as intended
for N and will therefore proceed in error.

3. The hardware modules developed in this report have no underlying
storage resource management: they allow for only one "activation
record" at a given time. Thus, overlapping invocations will result
in undefined behavior.

The rule is sufficient for our purposes to ensure proper behavior but no
claim is made that it is always necessary. (Note that Ada semantics permit
concurrent activations of operations within a package, although such
permissiveness can lead to non-deterministic behavior.) The fact that a unique
module is created in hardware for every variable, every computation (e.g.,
addition), and every comparison, suggests that control line conflicts will be
avoided as long as no module is presented with more than one command at a time.
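
The first constraint lends itself to a mechanical check. The sketch below, in
Python for illustration, flags any package instance that is invoked more than
once within a single state. The function name conflicts and the
'Instance.Operation' string convention are assumptions of the sketch.

```python
def conflicts(calls):
    """Return package instances invoked more than once within one state.

    `calls` lists the operations requested in a single state, written
    as 'Instance.Operation'.
    """
    seen, clashing = set(), []
    for call in calls:
        instance = call.split(".")[0]
        if instance in seen and instance not in clashing:
            clashing.append(instance)
        seen.add(instance)
    return clashing

print(conflicts(["Point.Load", "Point.Test"]))   # ['Point']
print(conflicts(["Point.Load", "Slot.Test"]))    # []
print(conflicts(["Point.Load", "Slot.Load"]))    # []
```

A state for which the check returns a non-empty list would have to be split
into successive states.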

6.  Stage  3:  Protocol-definition  Ada  program 

An Ada task defines a distinct thread of control. Ordinary subprogram calls
by a task T are regarded as traversals along this thread of control. Since
contention for subprogram activation has been eliminated by the constraints we
have imposed, Go/I'm done protocols can be used safely in such cases.
Inter-task communication is more complex since two separate threads of control
are involved and since contention is possible. Such communication is, therefore,
implemented with a four-cycle Request/Acknowledge protocol. Implementation
details for both kinds of communication are introduced in the transformation
from Stage 2 to Stage 3.

6.1.  Motivation  for  Stage  3 

Like its predecessor, the Protocol-definition stage is specified in legal Ada
code. The discipline introduced in Section 3 is extended. The
Protocol-definition stage realizes two goals:

1. New states are inserted and "Line" packages are instantiated to
specify protocols for communication between the program units
expressed in the Stage 1 code.

Note that the transformations presented thus far have been concerned
with communications within a given Stage 1 program unit. Since each
of the original program units maps into a unique state machine/data
path pair (engine), task entry calls, procedure calls, and function
calls between these units cannot be represented by simple control
line assertions. Instead, such communication must be implemented
using either Request/Acknowledge or Go/I'm Done protocols.

2. State label numbers are converted to binary numbers, primarily to
facilitate the encoding of the Stage 3 body part as an SLA state
machine, which takes place in Stage 4.


In the transformation to Stage 3, the list of declared hardware modules is
completed and the state machine is reduced to a sequence of if-statements, goto
statements, and subprogram calls representing control line assertions.


6.2.  Implementing  Inter-Program  Unit  Communications  Protocols 

Stage 3 inserts protocols only for those program units that are originally
specified in Stage 1. Protocols are already defined (in Stage 2) for program
units that are introduced as a result of building block generic package
instantiations.

In hardware representation each inter-engine communication requires two
communications lines. Each line (i.e., wire) is realized by the instantiation of
the generic package named "Line". The specification part for Line is:

    generic
    package Line is
        procedure Lift;
        -- Function:
        --    Assigns the logical value 1.
        procedure Lower;
        -- Function:
        --    Assigns the logical value 0.
        function Test return boolean;
        -- Function:
        --    Returns true if wire has logical value 1,
        --    else returns false.
    end Line;




An  instance  of  this  package  corresponds  to  a  physical  line  whose  level  may  be 
lowered,  raised,  or  tested. 
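
For readers who wish to experiment with the protocols, the behavior of a Line
instance can be modeled in a few lines. The following Python class is an
illustrative sketch only; the class and the variable storage_req are
hypothetical stand-ins, and the initial level of 0 is taken as an assumption
of the model.

```python
class Line:
    """One wire; mirrors the Lift, Lower, and Test operations."""
    def __init__(self):
        self._level = 0          # assumed initial level: logical 0
    def lift(self):
        self._level = 1
    def lower(self):
        self._level = 0
    def test(self):
        return self._level == 1

storage_req = Line()
print(storage_req.test())   # False
storage_req.lift()
print(storage_req.test())   # True
```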


6.2.1.  Transforming  Procedure  and  Function  Calls 

A procedure or function X is mapped from Stage 2 to Stage 3 as follows:

1. Line packages X_Go and X_Done are instantiated.

2. The decision "if X_Go.Test()" is inserted as the initial state. (The
machine remains in this state until X_Go.Test becomes true. Lines are
always initialized to the logical value 0, regarded here as false.)

3. "X_Done.Lift" is made the action of the final state. The state
machine of X takes the necessary actions to allow the caller to "see"
the return values at the same time X_Done is sensed true.

Program units that contain procedure and function calls to other program units
must also be transformed to reflect the calling protocol. For example, the
action:

    <<State_1>> X(some_arguments);   -- Call on X
                goto State_2;

is transformed into:

    <<State_1>> X_Go.Lift;
                X(some_arguments);   -- The original action.
                goto State_2;
    <<State_2>> if X_Done.Test then
                    -- Load the out parameters/function result
                    -- into proper register(s).
                    goto State_3;
                else
                    goto State_2;
                end if;

Notice that the original invocation of X is left in the code.
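
The interplay of the caller and of X under the Go/I'm Done protocol can be
traced with a clocked simulation. The Python sketch below is illustrative
only: the function simulate_call, the callee state names "wait", "busy", and
"done", and the parameter work_steps (the number of states X spends on its
operation) are assumptions of the sketch, and both machines are assumed to
sample the lines as they stood at the start of each clock tick.

```python
def simulate_call(work_steps):
    """Clock a caller and a callee X that share X_Go and X_Done lines."""
    x_go = x_done = False            # lines start at logical 0
    caller, callee = "State_1", "wait"
    remaining, trace = work_steps, []
    while caller != "State_3":
        trace.append((caller, callee))
        # Both machines sample the lines as they stood before this tick.
        go_seen, done_seen = x_go, x_done
        if caller == "State_1":
            x_go = True              # X_Go.Lift
            caller = "State_2"
        elif done_seen:              # State_2: X_Done.Test is true
            caller = "State_3"
        if callee == "wait" and go_seen:
            callee = "busy"
        elif callee == "busy":
            remaining -= 1
            if remaining <= 0:
                x_done = True        # X_Done.Lift in the final state
                callee = "done"
    return trace

for step in simulate_call(2):
    print(step)
```

The caller loops in State_2 for as long as X needs, which is why no timing
assumption about X has to be built into the caller's state machine.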

6.2.2.  Transforming  Task  Entry  Calls  and  Accept  Statements 

The transformation of tasks is similar to that for subprograms. The scheme
outlined in the previous subsection is followed, although "X_Req" is substituted
for "X_Go" and "X_Ack" is substituted for "X_Done". Additionally, a Line
package is instantiated for each entry statement of the task. This Line and the
X_Req Line are "raised" concurrently by the calling task (via calls to the
respective Lift procedures). Each accept alternative in the receiving task
tests the task's request line and the corresponding entry statement line before
performing the desired operation. As an example, consider the task named
"Storage" that models a Read/Write memory. Storage is specified in Stage 1 as:

    task Storage is
        entry Read(
            address: integer;
            value: out integer);
        entry Write(
            address: integer;
            value: integer);
    end Storage;

The instantiations

    package Storage_Req is new Line;
    package Storage_Ack is new Line;
    package Storage_Read is new Line;
    package Storage_Write is new Line;

must be visible to Storage and all tasks which can call it.


The body of Storage is realized as:

    <<State_0000>> if Storage_Read.Test() and
                      Storage_Req.Test() then
                       goto State_0001;
                   elsif Storage_Write.Test() and
                         Storage_Req.Test() then
                       goto State_0100;
                   end if;

    <<State_0001>> accept Read(
                       address: integer;
                       value: out integer)
                   do
                       -- Perform read operation.
                       -- This may take several steps
                       -- in the general case but here
                       -- we simplify to one step.
                   end Read;
                   goto State_0010;

    <<State_0010>> Storage_Read.Lower();
                   goto State_0110;

    <<State_0100>> accept Write(
                       address: integer;
                       value: integer)
                   do
                       -- Perform write operation.
                   end Write;
                   goto State_0101;

    <<State_0101>> Storage_Write.Lower();
                   goto State_0110;

    <<State_0110>> Storage_Ack.Lift();
                   -- Raise the acknowledge line.
                   goto State_0111;

    <<State_0111>> if Storage_Req.Test() then
                       -- Keep Ack high until Req is lowered.
                       Storage_Ack.Lift();
                       goto State_0111;
                   else
                       Storage_Ack.Lower();
                       goto State_<some_next_state>;
                   end if;

A Stage 1 call on the Storage write operation such as

    <<State_4>> Storage.Write(
                    1,
                    Some_Value.Read());
                goto State_5;

is realized in Stage 3 as:

    <<State_1000>> Storage_Req.Lift();      -- Raise request line.
                   Storage_Write.Lift();    -- Raise write accept line.
                   Storage.Write(
                       1,
                       Some_Value.Read());
                   goto State_1001;

    <<State_1001>> if Storage_Ack.Test() then    -- Test acknowledge line.
                       Storage_Req.Lower();
                       goto State_<some_next_state>;
                   else
                       Storage_Req.Lift();
                       goto State_1001;
                   end if;


Note  that  the  effects  of  these  transformations  are  to: 


1.  Force  tasks  to  follow  standard  Request/Acknowledge  protocol. 

2.  Create  an  implicit  case  statement  which  directs  the  proper  accept 
alternative  choice  (e.g.,  State_0000  above). 
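
The four cycles of the Request/Acknowledge exchange can be observed by clocking
the calling task and the Storage task together. The Python sketch below is an
illustration under stated assumptions: the function simulate_write and the
dictionary of line levels are hypothetical, both machines sample the lines at
the start of each tick, and the rendezvous body is reduced to a single state.
The state names follow the listings above.

```python
def simulate_write():
    """Clock the caller and the Storage task through one Write rendezvous."""
    line = {"Req": 0, "Ack": 0, "Write": 0}
    caller, storage, log = "1000", "0000", []
    for _ in range(20):                      # bounded number of clock ticks
        seen = dict(line)                    # levels sampled at tick start
        if caller == "1000":
            line["Req"] = line["Write"] = 1  # raise request and entry lines
            caller = "1001"
        elif caller == "1001" and seen["Ack"]:
            line["Req"] = 0                  # acknowledge seen: drop request
            caller = "next"
        if storage == "0000" and seen["Write"] and seen["Req"]:
            storage = "0100"
        elif storage == "0100":              # accept Write ... end Write
            storage = "0101"
        elif storage == "0101":
            line["Write"] = 0                # Storage_Write.Lower
            storage = "0110"
        elif storage == "0110":
            line["Ack"] = 1                  # Storage_Ack.Lift
            storage = "0111"
        elif storage == "0111" and not seen["Req"]:
            line["Ack"] = 0                  # request gone: drop acknowledge
            storage = "next"
        log.append((line["Req"], line["Ack"]))
        if caller == storage == "next":
            break
    return log

log = simulate_write()
phases = [p for i, p in enumerate(log) if i == 0 or p != log[i - 1]]
print(phases)   # [(1, 0), (1, 1), (0, 1), (0, 0)]
```

The printed (Req, Ack) pairs show the four cycles in order: request raised,
acknowledge raised, request lowered, acknowledge lowered.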


6.3. Transformation to Binary Numbers

In Stage 4, states are encoded as a series of "0" and "1" cells that are
connected to SR flip-flops. For example, <<State_0110>> is realized by placing
"0", "1", "1", and "0" cells in the same row (AND plane) in adjoining columns of
a matrix called an SLA. The level associated with this row is "raised" whenever
that sequence of values 0110 is stored collectively in the flip-flops. We
regard raising this row's level as equivalent to being in State 0110.

To facilitate this encoding, state label numbers are transformed to binary
representations as the last action of Stage 3. With the completion of the state
expansions outlined earlier in this section, the state machine is fully
specified.

In summary, Stage 2 to Stage 3 transformations can be performed in two
passes. The first pass inserts the necessary state and package instantiations
to specify the communications protocols. The second pass converts the state
label numbers to binary numbers.
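
The second pass amounts to a fixed-width binary renumbering, sketched below in
Python for illustration. The function name to_binary_labels is hypothetical,
and the choice of the minimum width that distinguishes all states is an
assumption; the report does not state how the width is chosen.

```python
def to_binary_labels(state_count):
    """Map state numbers 0..state_count-1 to fixed-width binary labels."""
    width = max(1, (state_count - 1).bit_length())
    return {n: f"State_{n:0{width}b}" for n in range(state_count)}

labels = to_binary_labels(16)
print(labels[6])    # State_0110
```

With this width, each bit of the label corresponds to one SR flip-flop of the
state machine.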

7. Stage 4: SLA Program

This  section  discusses  SLA  programs  and  their  derivation  from  Stage  3. 

7.1.  Background  and  Use  of  SLA  Programs 

SLA is an acronym for Storage Logic Array. SLA methodology lends itself to
the realization of interacting state machine/environment pairs; SLAs are used to
describe both the state machine and the data path components. The SLA concept
was originally conceived by S. Patil [15] [14], extended by Patil and
Welch [12] [13], and further extended by K. Smith [18]. Simply put, SLAs are
"folded" Programmable Logic Arrays (PLAs) in which column and row breaks in both
the AND and OR planes allow the design of independent arrays in the same
circuit. "Programming" an SLA involves the placement of symbolic elements (with
the help of an editor) in a manner that may result in representing an arbitrary
number of independent finite state machines whose interconnection is specified
by the SLA program. These symbolic elements may then be automatically
translated into IC layout masks in the appropriate circuit technology. The
translation of the SLA program into an integrated circuit can be viewed as the
actual placement of finite SLA machines onto the active area of the chip. SLA
programs make it easy for the designer to visualize the physical layout of the
circuit from its logical description. A designer who thinks primarily in terms
of the functional description effectively specifies the physical layout as well.
Smith and co-workers have designed SLAs in I2L, NMOS, and CMOS technologies
[18]. More recent work by Smith's group has extended the SLAs based on a new
concept for cell set design. The new circuits, called PPLs, are being primarily
applied in the design of asynchronous state machines [11].

Our method uses SLAs in two ways:

1. The SLA modules previously developed are treated as hardware
components that replace the Stage 3 generic packages. Note that no
formal method is employed for the design of the SLA modules.
However, each module has been simulated independently to test its
correctness.

2. The state machines, including control and feedback lines, are encoded
as SLAs [13].

We use SLA cells to build a library of composite "macros", which are the
Stage 4 modules described in Section 5. These modules comprise the data path
and are inserted using a cell substitution approach. In this sense our use of
SLAs is similar to the use of macro cells [3] and Associative Logic [7].

The particular cell set employed in this work was the 5 micron NMOS set
described in [17]. An SLA editor (SCLED [20]) and an SLA simulator (NSIM [19])
were built and tested at Utah; both were used extensively in this study.

7.2. Encoding of State Machines

The Stage 3 specification of a state, say, State 0110, results in the
connection of the appropriate SLA cells such that the row corresponding to State
0110 goes high at the proper time. Further, in each state the levels on columns
"connected" to the row of a given state are raised when the SLA is in that
state. These columns are the sources of the control lines, which correspond to
the operations to be initiated in that state. A two-pass method is employed to
accomplish the desired encoding. This technique is presented by referring to a
simple example. Consider the Stage 1 if-statement construct:

    if A = B then
        C := C + 1;
    else
        A := B + 1;
    end if;

With the assumptions that "A" maps into a Register while "B" and "C" map into
Counters, this construct could be specified in Stage 3 as:

    <<State_0000>> Equals.Go(A.Read, B.Lookup, equals_result);
                   goto State_0001;
    <<State_0001>> if equals_result then
                       goto State_0010;
                   else
                       goto State_0011;
                   end if;
    <<State_0010>> C.Increment;
                   goto State_0110;
    <<State_0011>> B.Increment;
                   goto State_0100;
    <<State_0100>> A.Write(B.Lookup);
                   goto State_0101;
    <<State_0101>> B.Decrement;
                   goto State_0110;
    <<State_0110>> null;


In the first pass, the states of Stage 3 are scanned sequentially. Every
function and procedure call on a generic package instantiation in Stage 3 is
transformed into the raising of a control line when the row corresponding to the
given state "goes high". If-statements are transformed into two rows, one for
each possible result of the if. The state machine layout rules employed are:

1. For simplicity, columns representing test inputs and control line
outputs that are used to communicate with other state machines
(program units) are placed on the left of the state machine and those
that communicate to local modules are placed on the right.

2. Rows and columns are annexed as needed as the Stage 3 states are
scanned. When a new Stage 3 subprogram call is discovered, a column
is designated to carry the corresponding control line.
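
Before encoding, the Stage 3 listing for the example if-statement can be
checked by direct simulation. The Python sketch below is illustrative only;
the function run_if_example is hypothetical, and the Register and Counter
modules are modeled as plain integers. It shows in particular that the
else-branch leaves B unchanged: B is incremented, copied into A, and then
decremented again.

```python
def run_if_example(a, b, c):
    """Execute the Stage 3 states for: if A = B then C := C + 1; else A := B + 1."""
    state, equals_result = "0000", False
    while state != "0110":
        if state == "0000":
            equals_result = (a == b)      # Equals.Go(A.Read, B.Lookup, ...)
            state = "0001"
        elif state == "0001":
            state = "0010" if equals_result else "0011"
        elif state == "0010":
            c += 1                        # C.Increment
            state = "0110"
        elif state == "0011":
            b += 1                        # B.Increment
            state = "0100"
        elif state == "0100":
            a = b                         # A.Write(B.Lookup)
            state = "0101"
        elif state == "0101":
            b -= 1                        # B.Decrement
            state = "0110"
    return a, b, c

print(run_if_example(3, 3, 0))   # (3, 3, 1)
print(run_if_example(1, 5, 0))   # (6, 5, 0)
```

The three-state else-branch is the price of using B's own counter in place of
a separate adder.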


Figure 7-1 presents the result of the initial encoding pass over the Stage 3
code presented above.

       0000000000000000
       0000000001111111
       1234567890123456
     1:F F F F OBOBOBOB;
     2:                ;
     3:                ;
     4:                ;
     5:                ;
     6:                ;
     7:0 0 0 0S +++    ;
     8:0 0 0S1R0       ;
     9:0 0 0S1 1       ;
    10:0 0S1 0     +   ;
    11:0 0S1R1R     +  ;
    12:0 1 0 0S   +  + ;
    13:0 1 0S1R       +;
    14:0 1 1 0         ;

    Column  9: result from Equals
    Column 10: Equals.Go
    Column 11: A.Read
    Column 12: B.Lookup
    Column 13: C.Increment
    Column 14: B.Increment
    Column 15: A.Write
    Column 16: B.Decrement

Figure  7-1:  First  Pass  Stage  4  Encoding 
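
The column allocation rule of the first pass (a column is dedicated to a
control line the first time that line is scanned, and reused afterwards) can
be sketched as follows. The Python code is illustrative only; the function
allocate_columns and the list representation of the Stage 3 states are
hypothetical, and column 10 is taken as the first free column because column 9
carries the comparator result.

```python
def allocate_columns(states, first_column=10):
    """Give each control line a column the first time it is scanned."""
    column = {}
    for _, control_lines in states:
        for name in control_lines:
            if name not in column:            # otherwise reuse the column
                column[name] = first_column + len(column)
    return column

stage3 = [
    ("0000", ["Equals.Go", "A.Read", "B.Lookup"]),
    ("0001", []),
    ("0010", ["C.Increment"]),
    ("0011", ["B.Increment"]),
    ("0100", ["A.Write", "B.Lookup"]),        # B.Lookup already has a column
    ("0101", ["B.Decrement"]),
    ("0110", []),
]
columns = allocate_columns(stage3)
print(columns["B.Lookup"], columns["B.Decrement"])   # 12 16
```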


Note how state 0000 (row 7) raises columns 10, 11, and 12. This row
corresponds to the "Equals.Go(A.Read, B.Lookup, ...)" operations specified for
state 0000 in the Stage 3 code above. State 0001 (rows 8 and 9) corresponds to
the if-statement. Row 8 "goes high" if the result from the comparator carried
in column 9 is false (i.e., a /= b). Row 9 goes high if the result is true (a =
b). Note how new columns are added on the right as new procedure and function
calls are scanned in the Stage 3 code. Note also how the B.Lookup (column 12)
is raised in State 0000 (row 7) and in State 0100 (row 12). The second time
"B.Lookup" is scanned in the Stage 3 code we remember that a column was already
dedicated to this control line; we don't dedicate another. Since this simple
circuit does not communicate with other state machines, all control line firings
are on the right side.

In the first pass the "+", "1", and "0" cells are placed only as the need for
them is discovered. A dispersed layout often results. The second manual pass
re-arranges the control lines to group lines that are directed to the same
module. Thus, the second pass merely clusters the control lines, arranging them
according to their destination. The effect of the second pass is to simplify
routing of the control lines to the modules. Figure 7-2 presents the result of
re-arranging the columns of Figure 7-1. Note how commands going to the same
module are now on adjacent columns.

       0000000000000000
       0000000001111111
       1234567890123456
     1:F F F F OBOBOBOB;
     2:                ;
     3:                ;
     4:                ;
     5:                ;
     6:                ;
     7:0 0 0 0S ++ +   ;
     8:0 0 0S1R0       ;
     9:0 0 0S1 1       ;
    10:0 0S1 0        +;
    11:0 0S1R1R     +  ;
    12:0 1 0 0S   ++   ;
    13:0 1 0S1R      + ;
    14:0 1 1 0         ;

    Column  9: result from Equals
    Column 10: Equals.Go
    Column 11: A.Read
    Column 12: A.Write
    Column 13: B.Lookup
    Column 14: B.Increment
    Column 15: B.Decrement
    Column 16: C.Increment


Figure  7-2:  Second  Pass  Stage  4  Encoding 
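
Although the second pass is performed manually, its clustering rule is simple
enough to sketch. The Python code below is illustrative only: the function
cluster_by_module is hypothetical, and the tie-breaking choices (modules in
order of first appearance, lines within a module in first-pass order) are
assumptions of the sketch rather than rules stated by the method.

```python
def cluster_by_module(control_lines):
    """Reorder control lines so lines to the same module are adjacent.

    Modules keep the order in which they first appear; lines within a
    module keep their first-pass order.
    """
    groups = {}
    for name in control_lines:
        groups.setdefault(name.split(".")[0], []).append(name)
    return [name for lines in groups.values() for name in lines]

first_pass = ["Equals.Go", "A.Read", "B.Lookup", "C.Increment",
              "B.Increment", "A.Write", "B.Decrement"]
print(cluster_by_module(first_pass))
```

For the running example, these choices give the order Equals.Go, A.Read,
A.Write, B.Lookup, B.Increment, B.Decrement, C.Increment.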


7.3.  Layout,  Routing  and  Busing  Issues 

An algorithmic method for cell layout and routing has not yet been
incorporated into our method. Reference [6] discusses a simple manual routing
method that utilizes the fact that the declaration part of a given Stage 3
program unit specifies the modules utilized by that unit.


As mentioned earlier, engines that are physical representations of tasks
communicate through the use of the Request/Acknowledge protocol. In the
hardware realm, such engines communicate via buses. A circuit derived by our
method may include several buses, which may be private (non-contention) or
public (with potential for contention between the users). Both types support
the Request/Acknowledge protocol. It is well-known that a Request/Acknowledge
protocol strategy will not work on a contention bus without some sort of
arbitration mechanism. The Request/Acknowledge protocol implemented here
closely follows the scheme outlined by Seitz [16], and appears to be adaptable
to his arbitration scheme. Bus issues are detailed further in [6].

8.  Conclusions 

The transformation methodology described in the preceding sections was
developed and exercised in conjunction with an extensive and non-trivial case
study [6]. The algorithm developed for that exercise is a possible model for
the behavior of the Ada selective wait statement, itself initially specified as
an Ada program consisting of a set of intercommunicating Ada server and
requestor tasks. The transformation rules were only applied to a subset of the
program. Application of the rules resulted in two SLA programs whose behavior
was tested with the simulator ASIM.

The case study [6] provided a "real" example of rule-based transformations
which covers a significant portion of the Ada-to-Silicon "spectrum". No
theoretical stumbling blocks were encountered in this process, which suggests
that there is nothing in principle to invalidate the concept that such
transformations may be automated. On the other hand, we have not yet formalized
these transformation rules as concrete algorithms. There is the additional
challenge of reaching practical and competitive circuits with this approach.

We have experimented with the intriguing concept of using Ada itself as an
intermediate language in the mapping process. For this purpose we have found
important ways to exploit Ada's abstraction features:

1. In mapping Ada program variables to instantiations of generic
packages to pre-defined IC modules.

2. In mapping Ada subprogram and task calls to specific hardware
protocols.

The end result of successful research in this area can be that the
traditional hardware logic design activity will become increasingly a
programming activity that is keyed to the use of high-order programming
languages for system specification. Such an evolution will progress, however,
only as rapidly as we succeed in evolving a new class of high-quality compilers
for hardware.


Abstract

This report describes the status of the Internet Protocol (IP) example being pursued as a case
study by the Utah Ada to Silicon Project. This document provides three contributions: (1) a
general introduction to the Internet Protocol for those unfamiliar with it, (2) a discussion and
"road map" through the structure of the Ada code that specifies the submodule representing IP,
which we have named INM_OUT, and (3) a complete listing of the source Ada code for
INM_OUT that is being used to guide the transformation of this submodule into silicon.
Parts 1 and 2 summarize the function of the IP and our major design decisions.

Other references [2, 3, 4] also include discussions of the IP case study and our approach to
mapping the IP into silicon. The source listings in part 3 have been compiled using the Intel
432 Ada compiler version available to us at this time. We have coded the complete
INM_OUT submodule in Ada and have succeeded in compiling most of it for execution on the
Intel iAPX 432 system, except for statements and declarations associated with uses of the Ada
rendezvous construct.

[As  later  versions  of  the  Intel  compiler  become  available,  we  expect  not  only  to  be  able  to 
compile  the  full  module  using  rendezvous  syntax  and  semantics,  but  to  execute  it  in  this  mode 
as  well.  In  the  meantime  we  are  working  with  a  version  of  the  code,  not  given  in  this  report, 
that simulates each rendezvous via Send/Receive primitives instantiated through use of the
Ada generic package mechanism.]


1. What is the Internet Protocol?

[...] transmitting messages from one computer [...] over an interconnected
system of networks (or "nets"). We term each [...] though such a net could be
worldwide [...]. The assemblage of these networks is called the Internetwork, or
"Catenet". Hosts directly interfacing to two or more local nets are called "gateways".

The primary reference for the IP is [5]. Quotations appearing in this document without
explicit attribution are taken from this reference.

1.1.  Protocol  Hierarchies 

[Figure 1-1: Protocol layering. Layers shown, top to bottom: higher-level protocol, IP, LNP,
line protocol.]

[...] abstraction. That is, at each level certain aspects of the overall communication problem
are solved, and rendered invisible to higher levels. For example, we shall see that the IP
deals with [...] under the abstraction of essentially unlimited [...]. [...] receiving modules
communicate through parameters packed in the headers of data passed to the next level.


1.2.  The  Role  of  the  IP 

The IP fundamentally provides a means of transmitting uninterpreted messages (segments)
between Hosts on possibly different local nets. The INMs accomplish this transmission by
packaging these segments in special data blocks termed datagrams, for transmission via one
or more local nets.

In performing its part of this internetwork service, the IP is concerned with two principal
duties:

1. Internet addressing: picking the desired "next hop" gateway for nonlocal
messages, and

2. Fragmentation and reassembly: splitting and merging messages that cannot be
transmitted intact due to inadequate local net packet sizes.

These duties can be explained metaphorically as follows. The IP functions like a
department-to-department mail service within an industrial organization. Each department
has a mail room, which deals with one or more courier services. When someone in a source
department has an item to send to another department, he or she wraps it in an unmarked
folder and deposits it in an out basket of the local mail room, with a delivery slip attached
giving instructions.

The  mail  room  prepares  the  folder  for  transmittal  by  inserting  it  into  a  company  mail 
envelope,  with  the  delivery  instructions  written  on  its  exterior.  It  then  selects  a  courier 
serving  the  destination  department's  mail  room,  and  gives  the  envelope  to  the  service's  agent. 
The  agent  then  puts  the  company  mail  envelope  into  one  of  the  service's  own  standard 
envelopes,  and  enters  it  into  its  shipping  system.  At  the  destination  the  process  is  reversed: 
the courier agent strips off the courier service envelope and delivers the company mail
envelope to the mail room, which in turn recreates a delivery slip from the instructions on
the company mail envelope, strips off the company mail envelope, and puts the folder (with
delivery slip attached) into one of the department's in baskets. The in basket is selected
according to standing processing instructions, based on the contents of delivery slips.

However,  two  complications  may  arise  in  accomplishing  this  folder  transmittal: 

1.  The  courier  services  available  to  the  source  mail  room  may  not  directly  service 
the  destination  department.  In  this  case,  the  mail  room  determines  a  (remote) 
courier  service  directly  serving  the  destination,  and  looks  the  service's  name  up 
in  a  routing  table.  This  table  gives  the  name  of  a  department  whose  mail  room 
has  agreed  to  transfer  mail  to  the  destination  department,  as  well  as  the  name  of 
a courier directly serving the transfer department. The source mail room then
gives  its  company  mail  envelope  to  the  shared  courier  service,  which  conveys  it  to 
the  transfer  department’s  mail  room.  The  envelope  is  then  relayed  out  via 
another  courier  service,  which  the  transfer  mail  room  determines  according  to  its 
own  routing  table. 

2. The second difficulty may be that the given folder size exceeds the capacity of the
largest envelope available from the selected courier service. In this case, the
mail  room  takes  the  liberty  of  partitioning  the  folder's  contents  so  that  each 
portion  will  fit  into  a  service  envelope.  However,  before  passing  each  portion  to 
the  courier  agent,  it  marks  on  the  portion's  company  mail  envelope  that  portion’s 
sequential  position  in  the  original  folder.  This  permits  the  portions  to  be 
reassembled  into  one  folder  in  the  destination  mail  room. 

This  thinly  disguised  analogy  maps  into  the  IP  world  as  follows: 

- A department is a Host, and a courier service is a local net.

- A mail room is an INM, and each courier agent is an LNM.

- A folder is a data segment for transmission over the catenet.

- An out basket is a SEND call, and an in basket is a RECV call. Delivery slips
are SEND/RECV call parameters.

- Each piece of company mail is a datagram if it contains a complete segment, and a
(datagram) fragment otherwise. (For convenience, we consider unfragmented
datagrams to be "fragments" as well.)

- Transfer mail rooms are gateways.

- Finally, a courier mail envelope is of course a local net packet.

(End of postal terminology, and resumption of Postel terminology.)


1.3. The TCM-INM Relationship

The  manner  by  which  the  TCM  communicates  with  the  INM  is  not  standardized.  However, 
the  IP  manual  [5]  illustrates  one  possible  implementation  through  a  pair  of  procedure  calls 
SEND  and  RECV. 

The  sending  TCM  issues  an  INM  call  of  the  form 

SEND(src, dst, ..., BufPTR, len, ...)

when it wishes to send a segment to a destination Host. Parameters src and dst give the
Internet addresses of the source Host (presumably itself) and destination Host, respectively.
Internet addresses are simply the concatenation of a net number and a Host number. The
segment to be transmitted is of length len (in 8-bit bytes, or "octets"), and may be found in
memory location BufPTR. (Omitted parameters will be discussed in section 2.1.)

If all goes well, this segment will be presented in due course to the TCM at the destination
Host. That TCM takes delivery of the incoming segment by completing a mating RECV call on
its INM, which we assume was awaiting its arrival:

RECV(BufPTR, ..., src, dst, ..., len),

where src, dst, and len are value-returning ("OUT") parameters, and BufPTR provides a
pointer to a preallocated segment buffer in the receiving TCM. Although dst is an OUT
parameter, we may assume that all segments delivered will have dst equal to the Host's
Internet address. Note that all through traffic at a gateway is handled by its INM without
involvement with the Host's higher level protocols (i.e. without TCM SEND/RECV
handling).

The  TCM,  for  its  part,  implements  several  higher-level  aspects  of  the  internet 
communication  process: 

- reliability (e.g. acknowledgements and retransmissions);

- error control at the segment level (i.e. checksumming TCP headers, etc.);

- flow control (controlling the rate at which segments are delivered to the INM);

- multiplexing (management of multi-purpose segments);

- connections (reserved portions of transmission capacity), and

- precedence and security (managing degrees of urgency and confidentiality of
segments).


1.4. The INM-LNM Relationship

The interface between the INM and LNM is not specified in [5]. One may speculate,
however, that it could follow the general form of the SEND/RECV calls at the TCM-INM
interface.

That is, when an INM has a fragment to send out on a local net, it issues a SEND call on the
net's LNM as follows:

SEND(src_ln, dst_ln, FBufPTR, Flen)

Parameters src_ln and dst_ln give the numbers of the sending and target Hosts on this net.
Recall that dst_ln will designate either this fragment's Internet destination Host, or the Host
serving as its next gateway. FBufPTR and Flen indicate the memory location and extent of
the fragment constructed by the INM.

Delivery of local net packets by LNMs at target Hosts is accomplished by completion of an
INM call (which again we assume is waiting) of the form:

RECV(FBufPTR, ..., src_ln, dst_ln, Flen),

where src_ln, dst_ln, and Flen are OUT parameters serving the obvious functions.


It is useful to note the communication functions provided by LNMs:

- packet formation and transmission;

- local net status control;

- routing of packets within each local net.


2. A Closer Look at IP Functionality


2.1. The Interface

The full parameterization of the SEND/RECV calls at the TCM-INM interface is as
follows.

SEND (src, dst, prot, TOS, TTL, BufPTR, len, Id, DF, opt, OUT result)

- src, dst: Internet source and destination addresses.

- prot: the next level protocol in effect (e.g. at the TCM level). Several of these have
already been assigned (see [7]): TCP, for instance, has assigned number 6.

- TOS: type of service (normal, high throughput, etc.) requested by the TCM.

- TTL: time to live, a time (in seconds) after which the datagram derived from this
segment can "self-destruct" if not delivered (see section 2.5).

- BufPTR, len: TCM segment pointers.

- Id: segment identification tag, for reassembling fragments derived from this
segment (see section 2.5).

- DF: a "don't fragment" switch.

- opt: options to be observed in transmitting the segment (see section 2.6).

- result: an OUT parameter in {OK, error}; OK = "datagram sent ok"; error =
"error in arguments, or local network error".

The  corresponding  RECV  call  issued  by  the  TCM  at  the  destination  Host  has  a  similar 
parameterization: 


RECV (BufPTR, prot, OUT result, OUT src, OUT dst, OUT TOS, OUT len, OUT opt)


The  purpose  of  these  parameters  should  be  evident  from  consideration  of  the  corresponding 
SEND  parameters.  Note,  however,  that  two  are  IN  (read-only): 

- BufPTR: a pointer to a buffer preallocated by the TCM for receipt of the incoming
segment.

- prot: an indication of which higher level protocol version this RECV call can
accommodate.


2.2. Datagram Formatting

Following section 1.1, a fruitful way of looking at protocol layering is to consider the
levels of envelope nesting that surround the raw data transmitted. This is illustrated in
Figure 2-1.

Figure 2-1: Data enveloping.

Many of these fields are directly transferred from corresponding SEND parameters.
However, a few bear clarification:

- Ver: version of the IP header layout.

- IHL: total header length, in multiples of 4 octets (32-bit words).

- TOS: a one-octet encoding of the type of service which the datagram
should be given en route to its destination. (This encoding is apt to be mapped to
other representations as the datagram moves first to the local net level and then
to other networks en route to the destination network.)

- Total length: total length of the datagram, in octets.

- Flg: three bits b0 b1 b2, where b0 must be zero, b1 = 1 iff the datagram should not be
fragmented, and b2 = 1 iff this fragment is not the final one of its datagram.

- Fragment Offset: gives the position of this fragment's message data within its
original segment, in units of 8 octets (64 bits). The first fragment of a datagram
has offset zero.

- Header checksum: from [5], p. 14:

"The checksum field is the 16 bit one's complement of the one's complement sum of all 16
bit words in the header. For purposes of computing the checksum, the value of
the checksum field is zero."
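
The quoted checksum rule can be sketched in a few lines. This is a minimal illustration in
Python and is not part of the Ada listing; the function names are hypothetical, and the
default position of the checksum field (octets 10 and 11 of the header) is an assumption
taken from the standard IP header layout rather than from the text above.

```python
import struct


def ones_complement_sum16(data: bytes) -> int:
    """One's complement sum of the 16-bit big-endian words in `data`."""
    if len(data) % 2:
        raise ValueError("length must be a multiple of 2 octets")
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        # Fold any carry out of bit 15 back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return total


def header_checksum(header: bytes, checksum_offset: int = 10) -> int:
    """Checksum of `header`, treating the checksum field itself as zero.

    `checksum_offset` is an assumed field position (standard IP layout).
    """
    zeroed = (header[:checksum_offset] + b"\x00\x00"
              + header[checksum_offset + 2:])
    return ~ones_complement_sum16(zeroed) & 0xFFFF


def header_is_valid(header: bytes) -> bool:
    """A header carrying a correct checksum sums to 0xFFFF."""
    return ones_complement_sum16(header) == 0xFFFF
```

A receiving INM could apply a check such as header_is_valid to each arriving fragment
and discard those that fail, in line with the rule in section 2.5 that fragments failing
the checksum test must be destroyed.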


2.3. The Internet Addressing Function

Internet  addresses  actually  have  three  formats,  providing  for  a  few  nets  with  relatively 
many  Hosts,  and  many  nets  with  relatively  few  Hosts.  These  formats  are: 

- Class A: a lead 0, followed by a 7-bit net name, followed by a 24-bit Host name.

- Class B: a lead 10, followed by a 14-bit net name, followed by a 16-bit Host name.

- Class C: a lead 110, followed by a 21-bit net name, followed by an 8-bit Host
name.

Several Class A network names have already been assigned [7].
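
The three formats can be told apart by their leading bits. The sketch below is a Python
illustration only, with hypothetical names; it assumes a 32-bit address, so that the
fields of each class total 32 bits (1+7+24, 2+14+16, and 3+21+8).

```python
from typing import NamedTuple


class InternetAddress(NamedTuple):
    addr_class: str  # "A", "B", or "C"
    net: int         # net name
    host: int        # Host name


def parse_address(addr: int) -> InternetAddress:
    """Split a 32-bit Internet address into class, net name, and Host name."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    if addr >> 31 == 0b0:      # lead 0: 7-bit net, 24-bit Host
        return InternetAddress("A", (addr >> 24) & 0x7F, addr & 0xFFFFFF)
    if addr >> 30 == 0b10:     # lead 10: 14-bit net, 16-bit Host
        return InternetAddress("B", (addr >> 16) & 0x3FFF, addr & 0xFFFF)
    if addr >> 29 == 0b110:    # lead 110: 21-bit net, 8-bit Host
        return InternetAddress("C", (addr >> 8) & 0x1FFFFF, addr & 0xFF)
    raise ValueError("lead bits 111 match none of the three formats")
```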

As mentioned in section 1.2, the INM addressing function deals only with outgoing
datagrams, and amounts to picking the target Host on the next local net. This will involve use
of:

1. A gateway table, which will need to be updated periodically to reflect long term
additions and deletions of nets to the Internet system, as well as shorter term
changes in gateway availabilities.

2. Specific routing instructions, as given in the datagram options (see section 2.6).

2.4. Fragmentation

Fragmentation occurs on outgoing datagrams which will not fit into a single local net
packet. Note that fragment headers can be constructed without examination of the data
segment  to  be  transmitted.  This  means  that  a  buffer  the  size  of  a  local  net  packet  could  suffice 
for  fragmentation  if  space  is  at  a  premium.  The  IP  specification  [5]  gives  an  example 
fragmentation  procedure  (p.  26). 
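
The following Python sketch illustrates the idea; it is not the procedure given in [5], and
its names are hypothetical. It assumes that max_data is the number of data octets a local
net packet can carry once the fragment header is accounted for. Because Fragment Offset
is counted in units of 8 octets, every fragment except the last carries a multiple of 8
octets. Offsets and flags are computed from lengths alone, without looking at the data.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    offset_units: int     # Fragment Offset field, in units of 8 octets
    more_fragments: bool  # flag bit b2: set on all but the final fragment
    data: bytes


def fragment(data: bytes, max_data: int,
             dont_fragment: bool = False) -> List[Fragment]:
    """Split segment data into fragments carrying at most `max_data` octets."""
    if max_data < 8:
        raise ValueError("a packet must carry at least 8 data octets")
    if len(data) <= max_data:
        return [Fragment(0, False, data)]  # fits intact
    if dont_fragment:
        raise ValueError("datagram too large and DF is set")
    step = (max_data // 8) * 8             # round down to a multiple of 8
    fragments = []
    for start in range(0, len(data), step):
        more = start + step < len(data)
        fragments.append(Fragment(start // 8, more, data[start:start + step]))
    return fragments
```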


2.5. Reassembly

The IP specification also gives an illustrative reassembly algorithm (p. 28). The key points
from our perspective are the following:

- Reassembly is done only at Internet destinations, and not at gateways or other
intermediate Hosts (since we cannot be sure all fragments derived from a given
datagram will follow the same routing).

- Datagram fragments are reunited on the basis of a key formed from four fields of
the fragment headers: source, destination, protocol, and identification. Sending
TCMs must choose identification fields such that this 4-tuple is unique
throughout the Internet system for the lifetime of a datagram.

- Strangely enough, fragment headers do not include the overall size of a
(reassembled) datagram. Hence preallocation of a complete buffer for each
incoming datagram is not generally feasible, unless either a small limit is imposed
on incoming datagram size, or the datagram arrival rate is assumed to be low.

- Various anomalies can occur in the arrival of fragments, e.g. duplications,
reorderings, and omissions. The INM is free to handle these however it wishes,
except that fragments with headers that fail the checksum test must be destroyed.

- Fragments are "aged" by decrementing their TTL field as they pass through the
Internet system. Each INM handling a fragment charges its processing time, with
a minimum of one (second) each. Presumably, the TTL for a datagram under
reassembly is the minimum of the TTLs for its delivered fragments. When this
TTL reaches zero, the partially formed datagram is destroyed, and the buffer is
released.
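
These points can be combined in a small sketch. The Python below is illustrative only and
is not the algorithm of [5]; the class and method names are hypothetical. It assumes
fragments do not overlap, stores pieces by offset rather than preallocating a full buffer,
keeps the minimum TTL seen for each datagram, and expects tick to be driven by a timing
pulse.

```python
from typing import Dict, Optional, Tuple

# (source, destination, protocol, identification)
Key = Tuple[int, int, int, int]


class Reassembler:
    def __init__(self) -> None:
        self._partial: Dict[Key, dict] = {}

    def pending(self) -> int:
        """Number of datagrams currently under reassembly."""
        return len(self._partial)

    def add(self, key: Key, offset_units: int, more_fragments: bool,
            data: bytes, ttl: int) -> Optional[bytes]:
        """Record one fragment; return the datagram data once complete."""
        entry = self._partial.setdefault(
            key, {"pieces": {}, "total": None, "ttl": ttl})
        entry["ttl"] = min(entry["ttl"], ttl)  # minimum TTL of fragments seen
        start = offset_units * 8               # offsets are in 8-octet units
        entry["pieces"][start] = data          # a duplicate simply overwrites
        if not more_fragments:                 # final fragment fixes the size
            entry["total"] = start + len(data)
        return self._try_complete(key)

    def _try_complete(self, key: Key) -> Optional[bytes]:
        entry = self._partial[key]
        if entry["total"] is None:
            return None
        out = bytearray()
        for start in sorted(entry["pieces"]):
            if start != len(out):              # a gap: some fragment is missing
                return None
            out += entry["pieces"][start]
        if len(out) != entry["total"]:
            return None
        del self._partial[key]
        return bytes(out)

    def tick(self, seconds: int = 1) -> None:
        """Age partial datagrams; destroy those whose TTL reaches zero."""
        for key in list(self._partial):
            self._partial[key]["ttl"] -= seconds
            if self._partial[key]["ttl"] <= 0:
                del self._partial[key]
```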


2.6. Options

Options indicate special handling for datagrams, as requested by the sending TCM. The use
of options is optional, but their implementation is mandatory.

The essential options are summarized below, omitting "null-options" such as no-ops,
padding, etc. An asterisk indicates that the option is copied in every derived fragment.

- *Security: for sending "security, compartmentalization, handling restrictions, and
TCC (closed user group) parameters".

- *Loose Source and Record Route (LSRR): for specifying a series of internet
addresses through which a datagram is to be routed. The routing is loose because
"the gateway or Host IP is allowed to use any route of any number of other
intermediate gateways to reach the next address in the route". The route is
recorded in the sense that a pointer packaged as part of the option is advanced as
each intermediate address is reached.

- *Strict Source and Record Route (SSRR): similar to LSRR, except that "the
gateway or Host IP must send the datagram directly to the next address in the
source route through only the directly connected network indicated in the next
address to reach the next gateway or Host specified in the route."

- Record Route: requires each INM handling the fragment to concatenate its
address into the space allocated for this option (if sufficient space remains).

- *Stream Identifier: "provides a way for the 16-bit SATNET stream identifier to
be carried through networks that do not support the stream concept."

- Internet Timestamp: indicates that each INM handling the fragment should
concatenate its time of receipt (in milliseconds since midnight UT) into the space
allocated for this option.


2.7.  Internet  Control  Message  Protocol  (ICMP) 

The INM must implement a special protocol that is a companion to the IP, for reporting
errors in datagram transmission and requesting special INM services. This protocol, termed
the ICMP [6], is mandated as follows:

"ICMP uses the basic support of IP as if it were a higher level protocol, however, ICMP is
actually an integral part of IP, and must be implemented in every IP module."

ICMP datagrams may be recognized by INMs through the special prot = 1 header
indication. For obvious reasons, ICMP datagrams are not sent regarding errors in delivering
ICMP datagrams. Briefly, their varieties are as follows:

1. Destination unreachable: a receiving gateway could not transfer a datagram,
or a don't fragment request could not be honored.

2. Time exceeded: a first fragment, or unfragmented datagram, was
superannuated.

3. Parameter problem: a datagram header was found to be malformed.

4. Source quench: a destination Host requests a slower rate of transmission from a
source Host.

5. Redirect: a gateway advises a Host not to route traffic to a particular distant net
through it.

6. Echo or echo reply: used to "reflect" datagrams back from destinations to
sources, for testing purposes.

7. Timestamp or timestamp reply: similar to echo and echo reply, but with a
destination timestamp.

8. Information or information reply: used for querying "what network is this?".


3. Current Design

We summarize here the principal features of the AtoS approach to implementing the INM,
as well as remarks on the current status of that implementation.


3.1.  Major  Design  Decisions 

There  have  been  two  major  design  decisions  thus  far. 

1. The first is to split the INM into three submodules: an INM_OUT dealing with
traffic outbound on a given local net, an INM_IN similarly handling inbound
traffic, and an INM_SRV tying them together and interfacing to the Host(s).
We envision one INM_IN and INM_OUT pair for each local net interface, but
only one INM_SRV per INM.

2. The second decision is to use a two-phase Ada rendezvous to implement both the
upper (TCM) and lower level (LNM) interfaces. In each case, a task call is
performed by the initiator of the data transfer action, with the receiver servicing
the transfer through an appropriate entry. When the data transferred has been
fully processed, a reciprocal rendezvous takes place (with call and entry roles
reversed) to report the success or failure of that processing. [An alternative
formulation, based on passing messages via ports such as is done in the i432
architecture, is also under consideration.]
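
The two-phase exchange can be imitated outside Ada to make the call and entry roles
concrete. The Python sketch below is only an analogy built on threads and queues; the
Entry class and run_transfer function are hypothetical. It simplifies the Ada semantics:
here a caller is released as soon as its item is accepted, whereas an Ada caller remains
suspended until the body of the accept statement completes.

```python
import queue
import threading


class Entry:
    """A task entry: call() blocks until some accept() has taken the item."""

    def __init__(self) -> None:
        self._calls: queue.Queue = queue.Queue()

    def call(self, item) -> None:
        done = threading.Event()
        self._calls.put((item, done))
        done.wait()              # caller is suspended until accepted

    def accept(self):
        item, done = self._calls.get()
        done.set()               # release the caller
        return item


def run_transfer(segment: bytes) -> str:
    transfer = Entry()           # entry owned by the receiver
    report = Entry()             # entry owned by the initiator

    def receiver() -> None:
        data = transfer.accept()               # phase 1: accept the data
        status = "OK" if data else "error"     # stand-in for real processing
        report.call(status)                    # phase 2: roles reversed

    worker = threading.Thread(target=receiver)
    worker.start()
    transfer.call(segment)       # phase 1: initiator is the caller
    result = report.accept()     # phase 2: initiator is the acceptor
    worker.join()
    return result
```

Between the two phases the initiator is free to do other work, which is the practical
difference from a single rendezvous that holds the caller until processing is finished.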

Division  of  functional  responsibilities: 

1. INM_SRV:

a. Receive segments from and deliver segments to TCMs in the Host(s)
served.

b. Accept incoming segments from the INM_INs, and

i. deliver via local Host RECV calls all segments so addressed, and

ii. (if implementing a gateway) route to appropriate INM_OUTs all
through traffic.

c. Maintain a gateway transfer table, used to route all outbound segments
(whether from a local Host or neighboring INM_IN). If an outbound
segment has a non-local net name in its destination address, that net
name is used as a key to select the appropriate next gateway directly
reachable by a local net served.

d. Implement ICMP message generation and transfer.

e. Handle options:

i. Security: reject all classified traffic, perhaps with an ICMP report
of "destination unreachable".

ii. LSRR, SSRR, and record route.

iii. Timestamping (note this requires a time of day service,
presumably from the TCM).

[Note that all message traffic through the INM_SRV is in segment form;
datagram (or fragment) form is used solely within INM_IN and INM_OUT
submodules.]

2. INM_OUT:

a. Form fragments from segments received from INM_SRV.

b. Deliver fragments to the LNM_OUT of its assigned local net, along with
their local net addresses (final or gateway), as provided by INM_SRV.

c. Map the Internet type of service parameter to an appropriate local net
type of service, or reject the fragment if this is not possible.

3. INM_IN:

a. Receive fragments from the LNM_IN of its assigned local net.

b. Reassemble fragments into complete datagrams (destination fragments [...]).

c. Delete over-age and erroneous fragments (note this requires a timing pulse
at least once each second).

Ada Specifications for the DoD Internet Protocol: The INM_OUT Submodule, Report No. 1


4. Ada Specifications for INM_OUT: A Road Map
The INM_OUT module, whose functionality is described in the preceding section, has been
specified in full in Ada code. The purpose of this section is to review the structural
organization of this code as a set of interrelated Ada packages, embedded tasks, and auxiliary
procedures. The code itself is listed in the Appendix as a series of 14 separate compilation
units.


4.1. Communication between INM_OUT and its "neighbor" modules
To better understand the code organization, it is useful first to visualize the communication
channels that are assumed to exist between INM_OUT and other modules [1]. These
channels suggest the important intertask communication of the Ada code to be described.
Recognition of these channels determines the gross organization of the code that embodies this
modular organization. Figure 4-1 shows the channels not only between INM_OUT and its
"neighbors", but also identifies two other important channels that are assumed to exist; the
latter, however, are not detailed within the code to be described.


[Figure 4-1 diagram not recoverable; it shows the modules INM_SRV, MEMORY, FIFO,
INM_OUT, and LNM_OUT joined by directed arcs.]

Figure 4-1: Communication channels (tasking interfaces) between INM_OUT
and its "neighbor modules". Directed arcs indicate direction
of intertask requests (Ada entry calls). Arcs composed of
asterisks (*) represent assumed communication channels that
are not now modeled in the Ada code.


Discussion in the preceding section has already explained the role of the INM_SRV and
LNM_OUT modules. The module marked "MEMORY" is, depending on the specific
implementation, either a memory to which INM_SRV and INM_OUT have shared access or
a control unit that governs access to some such memory unit. The module marked FIFO is
assumed to be a hardware unit functioning as a first-in-first-out queue. Outbound datagram
fragments are passed through the FIFO module to LNM_OUT. The FIFO must be capable of
[...] (maximum-sized) datagram fragment. The module is assumed to operate [...].

The arcs in Figure 4-1 indicate the direction of intermodule requests, which are [...]
entry calls. [...] entry calls can specify transmission of both [...]. Completion of the
rendezvous initiated by an Ada task [...] may involve both sending data to the callee and
receiving data from the [...]. Thus, though such a "transaction" may always be initiated by a
particular [...], [...] directions as a result of a [...].

[...] requests to LNM_OUT as well as to FIFO and MEMORY. Depending on the nature of
these requests, message [...] or in both directions.

Requests from INM_SRV to INM_OUT are of two kinds:

1. [...] the purpose of providing INM_OUT with initialization
information. An initialization request is a message that supplies INM_OUT
[...] acquire, via MEMORY, the actual initialization values.

2. [...] A transmission request [...] INM_OUT can use to locate and acquire, via
MEMORY, the actual datagram prepared (or transshipped) by INM_SRV.

A message request from INM_OUT to MEMORY may either supply MEMORY with a
pointer value or receive from MEMORY a data value.

[...] to the FIFO unit [...] channel to LNM_OUT to issue requests for confirmation that the
[...]. In a [...] manner, the channel from INM_SRV [...] is used by the former to obtain
confirmation that the latter has correctly processed the preceding request.

[...] MEMORY, while important to the operation of INM_SRV, are not relevant to the
current discussion.


4.2. Structure of the corresponding Ada code

Each of the modules discussed in connection with Figure 4-1 is modelled by
a package, the principal one for our purposes being the package for INM_OUT, which is
[...]. [...] "specification part" and the "body part" of [...] module have been coded. By
contrast, it is only necessary for our purposes to [...] the LNM_OUT, FIFO, and MEMORY
modules, [...] are relevant in the design of INM_OUT. By similar reasoning, since
INM_SRV issues entry calls into INM_OUT and not vice versa, it is unnecessary to consider
[...]. For this reason, there is no package representing INM_SRV in the code section
displayed in this report.


4.3.  Definition  packages 

The full Ada code for INM_OUT, in the form of an Ada package, has been deliberately
[...] factored out; the factored information takes the form of three (auxiliary) definition
packages. These packages contain type information (type and subtype declarations and their
corresponding representation clauses, if any) as well as constant information (constant
declarations); these declarations are [...]. Thus, the "root" definition package is named
In_Out_Srv_Defs, because the contained declarative information is common to all three
parts of the Internet Module; the subsidiary package Inm_In_Out_Defs contains
declarative information common to both INM_IN and INM_OUT and depends on the
declarative information in In_Out_Srv_Defs. Finally, the definition package named
Inm_Out_Defs contains declarative information of relevance only to INM_OUT and to the
modules (LNM_OUT, MEMORY, and FIFO) to which it makes requests for service. Figure
4-2 shows the full dependency graph that has resulted from this decision to factor out common
declarative information. The graph also reveals that the packages representing MEMORY,
FIFO, and LNM_OUT modules have also been specified to depend on certain of the definition
packages.


[Diagram: boxes labelled In_Out_Srv_Defs, Inm_In_Out_Defs, Inm_Out_Defs,
Memory_Module, Local_Net_Module, Fifo_Module, and Inm_Out_Module, joined by
dependency edges.]

Figure 4-2: Graph illustrating the dependence of the module packages on
certain auxiliary definition packages.

4.4. Tasks defined within the Inm_Out_Module package
Three tasks are declared within the Inm_Out_Module package.

1. The main task, named Inm_Out, interfaces with INM_SRV and with
LNM_OUT such that a pipeline effect is achieved for speeding datagrams along
the outbound data path: Host module -> INM_SRV -> INM_OUT -> LNM_OUT.

2. An auxiliary (server) task, named Read_Init_Parameters, which obtains from
host-related memory the initial parameter values needed to perform datagram
transmission.

3. An auxiliary task named Translate_TOS_Task, which operates in parallel with
Inm_Out, the main task, by translating type-of-service information from
host-level to local-net-level encoding.

The specifications for these three tasks are found in the specification part of
Inm_Out_Module. The body parts of these three tasks are represented as stubs
in the body part of Inm_Out_Module, and the actual body parts of these tasks
are listed in separate compilation units. (See Figure 4-3.)


[Diagram: Inm_Out_Module containing Inm_Out (the main task),
Read_Init_Parameters (auxiliary task), and Translate_TOS_Task (auxiliary task).]

Figure 4-3: The three tasks embedded in Inm_Out_Module.
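
The parallelism described in item 3 of the task list above can be sketched as a two-entry server task: one entry takes the type-of-service byte and releases the caller at once, and a second entry returns the outcome of the lookup. What follows is a minimal, self-contained sketch and not the project's code; the task name Translate_Tos, the table contents, and the table size are assumptions made purely for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Tos_Lookup_Sketch is

   --  Hypothetical table contents and size, chosen only for illustration.
   subtype Octet is Integer range 0 .. 255;
   Max_Tos_Table_Size : constant := 3;
   type Tos_Table_Type is array (1 .. Max_Tos_Table_Size) of Octet;
   Tos_Table : constant Tos_Table_Type := (16#00#, 16#10#, 16#08#);

   task Translate_Tos is
      entry Begin_Translation (Tos_Byte : in Octet);
      entry Send_Result (Found : out Boolean; Tos_Index : out Natural);
   end Translate_Tos;

   task body Translate_Tos is
      Wanted : Octet;
      Hit    : Boolean;
      Index  : Natural;
   begin
      loop
         select
            --  Short rendezvous: copy the argument and release the caller.
            accept Begin_Translation (Tos_Byte : in Octet) do
               Wanted := Tos_Byte;
            end Begin_Translation;
         or
            terminate;
         end select;

         --  The lookup runs after the rendezvous, in parallel with the caller.
         Hit   := False;
         Index := 0;
         for I in Tos_Table'Range loop
            if Tos_Table (I) = Wanted then
               Hit   := True;
               Index := I;
               exit;
            end if;
         end loop;

         --  Second rendezvous: hand the result back when the caller asks.
         accept Send_Result (Found : out Boolean; Tos_Index : out Natural) do
            Found     := Hit;
            Tos_Index := Index;
         end Send_Result;
      end loop;
   end Translate_Tos;

   Found : Boolean;
   Index : Natural;

begin
   Translate_Tos.Begin_Translation (16#10#);
   --  The caller is free to do other work here while the lookup proceeds.
   Translate_Tos.Send_Result (Found, Index);
   Put_Line ("found = " & Boolean'Image (Found)
             & ", index =" & Natural'Image (Index));
   --  prints "found = TRUE, index = 2"
end Tos_Lookup_Sketch;
```

Splitting the exchange into two entries is what lets the caller continue between the two calls; a single entry with out parameters would hold the caller for the whole lookup.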


4.5. Important local procedures of Inm_Out_Module

Activity initiated within the main task (Inm_Out) is delegated in two ways: (a) by entry
calls to Read_Init_Parameters, and (b) by calls to one "principal" procedure defined in the
body part of the containing package (Inm_Out_Module). This procedure is Do_send, which
in turn issues calls on three other procedures, locally defined in Do_send. These are
Read_in_header, Compact_options, and Send_fragment. The respective purpose of each of
these principal and subsidiary procedures is spelled out in the commentary of their respective
specification parts, which are found in the specification part of Do_send. The body parts of
these procedures are represented as stubs in the body part of Do_send and appear as separate
compilation units in the listed code.

4.6. Section summary

This ends our short description, or "road map", through the code proper. There are 14
separate compilation units given in the Appendix. These are:

 1. In_Out_Srv_Defs        Top-level definition package.
 2. Inm_In_Out_Defs        Second-level definition package.
 3. Inm_Out_Defs           Second-level definition package.
 4. Memory_Module          Auxiliary module package.
 5. Fifo_Module            Auxiliary module package.
 6. Local_Net_Module       Auxiliary module package.
 7. Inm_Out_Module         The main package.
 8. Inm_Out                The main task.
 9. Read_Init_Parameters   Auxiliary task used by the main task, Inm_Out.
10. Translate_TOS_Task     Auxiliary task used by the procedure Read_in_header.
11. Do_send                Procedure local to Inm_Out_Module, called by Inm_Out.
12. Read_in_header         Procedure local to Inm_Out_Module, called by Do_send.
13. Compact_options        Procedure local to Inm_Out_Module, called by Do_send.
14. Send_fragment          Procedure local to Inm_Out_Module, called by Do_send.

-- Ada-to-Silicon Project
-- University of Utah
--
-- DoD Internet Protocol INM_OUT submodule
--
-- Ada code for the top-level definition package named:
--     In_Out_Srv_Defs
--
-- Version of November 1, 1982

package In_Out_Srv_Defs is
   -- Function:
   --   Contains definitions used by the INM_IN, INM_OUT, and
   --   INM_SRV modules.

   -- Useful bit-field types.

   subtype bit1  is integer range 0 .. 1;
   subtype bit3  is integer range 0 .. 7;
   subtype bit4  is integer range 0 .. 15;
   subtype bit8  is integer range 0 .. 255;
   subtype bit13 is integer range 0 .. 8191;
   subtype bit16 is integer range 0 .. 65535;
   subtype bit21 is integer range 0 .. 2097151;
   subtype bit24 is integer range 0 .. 16777215;
   subtype bit32 is integer range 0 .. 4294967295;

   shift1:  constant := 2;
   shift3:  constant := 8;
   shift4:  constant := 16;
   shift5:  constant := 32;
   shift6:  constant := 64;
   shift8:  constant := 256;
   shift13: constant := 8192;
   shift16: constant := 65536;

   subtype octet_type is bit8;

   type octet_buffer_type is array (integer range <>) of octet_type;

   -- Note on the unchecked conversion routines:
   --   Normally, the default storage ( ... ) for integers that are
   --   less than or equal to 16 bits is a short ordinal (16 bit field).
   --   So, normally converting a record of 2 bit8 integers to a bit16
   --   integer would be equivalent to trying to stuff 2 short ordinals
   --   into 1 short ordinal. The representation specifications fix this
   --   problem.

   -- Representation specifications section.

   byte: constant integer := 8;

   for bit1'size  use 1;
   for bit3'size  use 3;
   for bit4'size  use 4;
   for bit8'size  use 1*byte;
   for bit13'size use 1*byte + 5;
   for bit16'size use 2*byte;
   for bit21'size use 2*byte + 5;
   for bit24'size use 3*byte;
   for bit32'size use 4*byte;

end In_Out_Srv_Defs;


-- Ada-to-Silicon Project
-- University of Utah
--
-- DoD Internet Protocol INM_OUT submodule
--
-- Ada code for the definition package named:
--     Inm_In_Out_Defs
--
-- Version of November 1, 1982

with In_Out_Srv_Defs;
use In_Out_Srv_Defs;
package Inm_In_Out_Defs is

   -- Function:
   --   Contains definitions used by both the INM_IN and INM_OUT modules.

   -- INM data for communication with server

   max_header_length:  constant := ...;
   max_segment_length: constant := ...;

   header_buffer_low_address:  constant := ...;
   header_buffer_high_address: constant := max_header_length - 1;

   subtype header_ptr is integer range header_buffer_low_address
                                    .. header_buffer_high_address;

   subtype header_octet_buffer_type
      is octet_buffer_type(header_ptr);

   subtype ... is integer
      range header_buffer_low_address + 1 .. header_buffer_high_address;

   type two_octet_record is
      record
         lo: octet_type;
         hi: octet_type;
      end record;

   type header_buffer_type is
      record
         version:             bit4;
         IHL:                 bit4;
         type_of_service:     bit8;
         total_length:        two_octet_record;
         identification:      two_octet_record;
         flags:               bit3;
         fragment_offset:     bit13;
         time_to_live:        bit8;
         protocol:            bit8;
         header_checksum:     two_octet_record;
         source_address:      bit32;
         destination_address: bit32;
         octet_buffer:        octet_buffer_type( ... );
      end record;

   -- INM data for communication with LNM

   first_checksum_byte:  constant := 10;
   second_checksum_byte: constant := 11;

   max_inm_packet: constant := 126;   -- Octets (arbitrary).
                                      -- ????????? E.I.O. 676???

   subtype header_words  is integer range 5 .. 16;    -- Header length in words.
   subtype header_octets is integer range 20 .. 64;   -- Header length in octets.

   -- Functions:

   function "xor" (
      first_operand:  octet_type;
      second_operand: octet_type)
      return octet_type;

   function "xor" (
      operand1: two_octet_record;
      operand2: two_octet_record)
      return two_octet_record;

   function Mask (
      number_to_be_masked_formal: integer;
      mask_formal:                integer)
      return integer;
   -- Function:
   --   Performs a bitwise AND operation on
   --   the two passed parameters and returns the integer result.

   function Shift_right (
      number_to_be_shifted: integer;
      shift_distance:       integer range 1 .. 15)
      return integer;
   -- Function: Does equivalent of integer divide of number_to_be_shifted
   --   by 2 ** shift_distance, returning the equivalent of the quotient
   --   on unsigned (positive) integers.

   -- Representation specifications section.

   for two_octet_record use
      record
         lo at 0 range 0 .. 7;
         hi at 1 range 0 .. 7;
      end record;

end Inm_In_Out_Defs;
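
The bit-serial arithmetic that the comments above specify for "xor", Mask, and Shift_right can be checked against built-in operators. The sketch below is an illustration under stated assumptions and is not part of the 1982 listing: it uses a modular type (available from Ada 95 onward) purely as a reference, and the names Serial_Xor, Serial_And, and Mod_Octet are hypothetical.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Bit_Serial_Check is

   subtype Octet is Integer range 0 .. 255;

   --  Modular type (Ada 95 and later), used only as a reference;
   --  it does not appear in the 1982 listing.
   type Mod_Octet is mod 2 ** 8;

   --  Shift right by Distance bits: integer division by 2 ** Distance.
   function Shift_Right (Value : Natural; Distance : Positive) return Natural is
   begin
      return Value / 2 ** Distance;
   end Shift_Right;

   --  Bit-serial exclusive OR of two octets.
   function Serial_Xor (A, B : Octet) return Octet is
      Save_A : Natural := A;
      Save_B : Natural := B;
      Result : Natural := 0;
   begin
      for Index in 0 .. 7 loop
         --  The result bit is set when the two low-order bits differ.
         if Save_A rem 2 /= Save_B rem 2 then
            Result := Result + 2 ** Index;
         end if;
         Save_A := Shift_Right (Save_A, 1);
         Save_B := Shift_Right (Save_B, 1);
      end loop;
      return Result;
   end Serial_Xor;

   --  Bit-serial AND of two non-negative integers.
   function Serial_And (A, B : Natural) return Natural is
      First  : Natural := A;
      Second : Natural := B;
      Result : Natural := 0;
      Index  : Natural := 0;
   begin
      while First /= 0 and Second /= 0 loop
         if First rem 2 = 1 and Second rem 2 = 1 then
            Result := Result + 2 ** Index;
         end if;
         First  := Shift_Right (First, 1);
         Second := Shift_Right (Second, 1);
         Index  := Index + 1;
      end loop;
      return Result;
   end Serial_And;

   Mismatches : Natural := 0;

begin
   --  Compare against the built-in operators for every pair of octets.
   for A in Octet loop
      for B in Octet loop
         if Serial_Xor (A, B) /= Integer (Mod_Octet (A) xor Mod_Octet (B))
           or Serial_And (A, B) /= Integer (Mod_Octet (A) and Mod_Octet (B))
         then
            Mismatches := Mismatches + 1;
         end if;
      end loop;
   end loop;
   Put_Line ("mismatches =" & Natural'Image (Mismatches));
   --  prints "mismatches = 0"
end Bit_Serial_Check;
```

The exhaustive comparison is cheap for octets (65,536 pairs) and depends on Shift_Right dividing by 2 ** Distance, which is the behaviour the specification comment asks for.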


package body Inm_In_Out_Defs is

   function "xor" (
      first_operand:  bit8;
      second_operand: bit8)
      return bit8
   -- Function:
   --   Returns the Exclusive OR of two octets.
   --   The following implementation serves as a software guide only.
   is
      result, savea, saveb: bit8;
      abit, bbit:           bit8;
   begin
      -- Initialization.
      savea  := first_operand;
      saveb  := second_operand;
      result := 0;

      for index in 0 .. 7
      loop
         abit  := savea rem shift1;           -- Get the least
                                              -- significant bit.
         bbit  := saveb rem shift1;           -- Get the least
                                              -- significant bit.
         savea := Shift_right(savea, 1);      -- Strip off the least
                                              -- significant bit.
         saveb := Shift_right(saveb, 1);      -- Strip off the least
                                              -- significant bit.
         if abit /= bbit then
            result := result + shift1 ** index;   -- Add the current xor bits
                                                  -- to the result.
         end if;
      end loop;

      return result;
   end;

   function "xor" (
      operand1: two_octet_record;
      operand2: two_octet_record)
      return two_octet_record
   -- Function:
   --   Forms the exclusive OR for corresponding octets of two
   --   two_octet operands. Uses above declared "xor" function.
   --   I hope this is legal Ada. (G..., please check.) We use
   --   this function when performing checksumming on the full 16-bit
   --   checksums, which are represented as two_octet_records.
   is
      result: two_octet_record;
   begin
      result.lo := operand1.lo xor operand2.lo;
      result.hi := operand1.hi xor operand2.hi;
      return result;
   end;

   function Mask (
      number_to_be_masked_formal: integer;
      mask_formal:                integer)
      return integer
   -- The following implementation serves as a software guide only.
   is
      first_number:  integer;
      second_number: integer;
      result:        integer;
      index:         integer;
      masking_done:  boolean;
   begin
      -- Initialize variables.
      first_number  := number_to_be_masked_formal;
      second_number := mask_formal;
      result        := 0;
      index         := 0;
      masking_done  := false;

      -- Do a bit by bit AND of both numbers starting from the
      -- low order bit.
      while not masking_done
      loop
         -- Test to see if both low order bits are set.
         if (first_number rem 2) = 1 and (second_number rem 2) = 1 then
            -- Add the current bit into the result.
            result := result + 2 ** index;
         end if;

         -- Take off the low order bit from both numbers.
         first_number  := Shift_right(first_number, 1);
         second_number := Shift_right(second_number, 1);

         -- If either number is zero then we are done.
         if (first_number = 0) or (second_number = 0) then
            masking_done := true;
         else   -- Increment index
            index := index + 1;
         end if;
      end loop;

      return result;
   end Mask;

   function Shift_right (
      number_to_be_shifted: integer;
      shift_distance:       integer range 1 .. 15)
      return integer
   -- The following implementation serves as a software guide only.
   is
   begin
      -- Divide by 2 ** shift_distance, as the specification requires.
      return number_to_be_shifted / 2 ** shift_distance;
   end Shift_right;

end Inm_In_Out_Defs;


-- Ada-to-Silicon Project
-- University of Utah
--
-- DoD Internet Protocol INM_OUT submodule
--
-- Ada code for the intermediate-level definition package named:
--     Inm_Out_Defs
--
-- Version of November 1, 1982

with In_Out_Srv_Defs, Inm_In_Out_Defs;
use In_Out_Srv_Defs, Inm_In_Out_Defs;
package Inm_Out_Defs is

   -- Function:
   --   This package contains definitions used in the INM_OUT module
   --   and the units to which it interfaces.

   -- [Figure: Block Diagram of Anticipated Hardware Realization]

   ...                             -- Actual size depends on
                                   -- available space in the
                                   -- hardware representation.

   max_local_net_tos_byte_size: constant integer := 2;
                                   -- Max number of octets required
                                   -- to represent the local net TOS.
                                   -- We assume that 16 bits is more
                                   -- than sufficient to encode the
                                   -- local net tos.
                                   -- (Still not sure we need
                                   -- this constant. E.I.O.)

   early_ack: constant integer := 0;
   late_ack:  constant integer := 1;

   segment_low_address:  constant := 0;
   segment_high_address: constant := max_segment_length - 1;

   -- Types used for intertask communication.

   x: constant integer := 4;       -- Data path width: chunk of address.
                                   -- SRV -> OUT and OUT -> MEMORY.

   -- Communication between the
   -- INM_OUT and MEMORY modules.

   subtype chunk_of_address_type is integer range 0 .. 2 ** x - 1;
                                   -- Piece of start address for a datagram.
                                   -- Each piece has x bits.

   type memory_request_type is (
      load_address,
      receive_datum_octet);

   -- Communication between
   -- INM_SRV and INM_OUT modules.

   type srv_command is (
      init_1,
      init_2,
      init_3,
      init_4,
      init_5,                      -- Not currently used.
      init_6,                      -- Not currently used.
      init_7,                      -- Not currently used.
      send,
      test);

   y: constant integer := 4;       -- Data path width:
                                   -- OUT -> SRV.

   type out_response is (
      sent_ok,
      dont_fragment_error,
      unsupported_tos,
      bad_header,
      bad_srv_command,
      local_net_time_out,
      local_net_error,
      other);

   -- Communication among the
   -- INM_OUT, LNM_OUT and FIFO modules.

   s: constant := 4;               -- Data path width: OUT -> LN.

   type local_net_command_type is (receive_fragment);   -- Currently a set of one.

   t: constant := 4;               -- Data path width: LN -> OUT.

   type local_net_response_type is (
      fragment_received_ok,
      fragment_not_received);

   u: constant := 4;               -- Data path width: INM_OUT -> FIFO.

   type fifo_command_type is (
      reset,
      store,
      retrieve);

   -- Representation clauses.

   for memory_request_type use (
      load_address        => 0,
      receive_datum_octet => 1);

   for srv_command use (
      init_1 => 0,
      init_2 => 1,
      init_3 => 2,
      init_4 => 3,
      init_5 => 4,
      init_6 => 5,
      init_7 => 6,
      send   => 7,
      test   => 8);

   for out_response use (
      sent_ok             => 0,
      dont_fragment_error => 1,
      unsupported_tos     => 2,
      bad_header          => 3,
      bad_srv_command     => 4,
      local_net_time_out  => 5,
      local_net_error     => 6,
      other               => 7);

   -- Arbitrary choice. Hardware
   -- implementers may choose the reverse.

   for local_net_command_type use (
      receive_fragment => 0);

   for local_net_response_type use (
      fragment_received_ok  => 0,
      fragment_not_received => 1);

   for fifo_command_type use (
      reset    => 0,
      store    => 1,
      retrieve => 2);

end Inm_Out_Defs;


Ada Specifications for the DoD Internet Protocol

The INM_OUT Submodule  Report No. 1  page 25


-- Ada-to-Silicon Project
-- University of Utah
--
-- DoD Internet Protocol INM_OUT submodule
--
-- Ada code for the auxiliary package named:
--     Memory_Module
--
-- Version of November 1, 1982

with Inm_Out_Defs, In_Out_Srv_Defs;
use Inm_Out_Defs, In_Out_Srv_Defs;
package Memory_Module is

   -- Function:
   --   Represents the Memory module that holds to-be-sent datagrams
   --   as well as initialization parameters needed by INM_OUT.

   task Memory is

      -- Function:
      --   Responds to Request entry call to either receive x-sized address
      --   bytes or send octets of information from the memory module to which
      --   it has access. This task is a pure server, performing a memory
      --   function.

      entry Request (
         request_type_formal:     memory_request_type;
                                  -- Load_address or receive_datum_octet.
         chunk_of_address_formal: chunk_of_address_type;
                                  -- Don't care when request_type_formal
                                  -- is receive_datum_octet.
         octet_formal:            out octet_type);
                                  -- Don't care when load_address.

      -- Function:
      --   When request_type_formal is receive_datum_octet, this entry copies
      --   an octet of information from a referenced location in its
      --   accessible memory, writes it into the octet_formal parameter,
      --   and then increments that reference.
      --
      --   When request_type_formal is load_address, this entry
      --   "pursues construction" of a memory address by "taking in"
      --   the x-sized chunk of bits supplied by the second argument.
      --
      --   The values input for the second or third parameters are
      --   "don't cares" when the first argument is, respectively,
      --   receive_datum_octet or load_address.

   end Memory;

end Memory_Module;
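
The request protocol described in the comments above (address chunks taken in one at a time, then octets read out with an automatically advancing reference) can be sketched as a small server task. This is only an illustration, since the listing gives no body for Memory. The 8-bit address width, the most-significant-chunk-first order, the store contents, and the parameter names Kind, Piece, and Datum are all assumptions.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Memory_Server_Sketch is

   X : constant := 4;                            --  chunk width in bits
   subtype Chunk is Integer range 0 .. 2 ** X - 1;
   subtype Octet is Integer range 0 .. 255;
   type Request_Kind is (Load_Address, Receive_Datum_Octet);

   --  Hypothetical 256-octet store; the real memory size is not given.
   type Address is mod 2 ** 8;
   type Store_Type is array (Address) of Octet;

   task Memory is
      entry Request (Kind  : in  Request_Kind;
                     Piece : in  Chunk;
                     Datum : out Octet);
   end Memory;

   task body Memory is
      Store   : Store_Type := (others => 0);
      Current : Address    := 0;
   begin
      --  Hypothetical contents so that the reads below return something.
      Store (16#A5#) := 17;
      Store (16#A6#) := 42;
      loop
         select
            accept Request (Kind  : in  Request_Kind;
                            Piece : in  Chunk;
                            Datum : out Octet) do
               case Kind is
                  when Load_Address =>
                     --  Shift the address left by X bits and take in the
                     --  chunk (assumed most significant chunk first).
                     Current := Current * 2 ** X + Address (Piece);
                     Datum   := 0;               --  don't care
                  when Receive_Datum_Octet =>
                     Datum   := Store (Current);
                     Current := Current + 1;     --  advance the reference
               end case;
            end Request;
         or
            terminate;
         end select;
      end loop;
   end Memory;

   Dont_Care, First, Second : Octet;

begin
   --  Build address 16#A5# from two 4-bit chunks, then read two octets.
   Memory.Request (Load_Address, 16#A#, Dont_Care);
   Memory.Request (Load_Address, 16#5#, Dont_Care);
   Memory.Request (Receive_Datum_Octet, 0, First);
   Memory.Request (Receive_Datum_Octet, 0, Second);
   Put_Line (Integer'Image (First) & Integer'Image (Second));
   --  prints " 17 42"
end Memory_Server_Sketch;
```

Because the address register here is a modular type, older chunks simply shift out as new ones arrive; a real implementation would need its own rule for when a fresh address begins.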


-- Ada-to-Silicon Project
-- University of Utah
--
-- DoD Internet Protocol INM_OUT submodule
--
-- Ada code for the package named:
--     Fifo_Module
--
-- Version of November 1, 1982

with In_Out_Srv_Defs, Inm_Out_Defs;
use In_Out_Srv_Defs, Inm_Out_Defs;
package Fifo_Module is
   task Fifo is
      -- Function:
      --   Server task only; issues no calls.
      entry Fifo_req (
         command_formal: fifo_command_type;
         octet_formal:   octet_type);

      -- Function:
      --   This entry accepts the following command values:
      --     reset:    resets the FIFO
      --     store:    stores an octet in the FIFO
      --     retrieve: retrieves an octet from the FIFO

   end Fifo;

end Fifo_Module;


-- Ada-to-Silicon Project
-- University of Utah
--
-- DoD Internet Protocol INM_OUT submodule
--
-- Ada code for the auxiliary package named:
--     Local_Net_Module
--
-- Version of November 1, 1982

with Inm_Out_Defs;
use Inm_Out_Defs;
package Local_Net_Module is
   task Local_Net is
      -- Function:
      --   This task represents the local net module, which can receive
      --   and return responses.

      entry Out_req (
         command_formal:  local_net_command_type;
         response_formal: out local_net_response_type);

      -- Function:
      --   This entry receives a value of command_formal from the Inm_Out task
      --   and passes back a result through response_formal.
      --
      --   Command values are currently limited to only one value:
      --   receive_fragment.

   end Local_Net;

end Local_Net_Module;
-- Ada-to-Silicon Project
--
-- DoD Internet Protocol INM_OUT submodule
--
-- Ada code for the main submodule package named:
--     Inm_Out_Module
--
-- Version of November 1, 1982

with Memory_Module,
     Inm_Out_Defs,
     Inm_In_Out_Defs,
     In_Out_Srv_Defs,
     Unchecked_conversion;

package Inm_Out_Module is

   -- Function:
   --   This package contains task Inm_Out and an auxiliary procedure named
   --   Do_send. The task accepts commands from the SERVER module and acts
   --   to forward datagrams to the LOCAL NET module.

   use Memory_Module, Inm_Out_Defs, Inm_In_Out_Defs, In_Out_Srv_Defs;

   -- Instances of Unchecked_conversion:

   function Convert_twosome_array_to_record        -- Used by Read_in_header.
      is
      new Unchecked_conversion (
         source => octet_buffer_type(0 .. 1),
         target => two_octet_record);

   function Convert_twosome_array_to_integer       -- Used by Read_in_header.
      is
      new Unchecked_conversion (
         source => octet_buffer_type(0 .. 1),
         target => bit16);

   function Convert_two_octet_record_to_integer    -- Used by Read_in_header.
      is                                           -- Used in Do_send.
      new Unchecked_conversion (
         source => two_octet_record,
         target => bit16);

   function Convert_integer_to_two_octet_record    -- Used in Do_send.
      is
      new Unchecked_conversion (
         source => bit16,
         target => two_octet_record);

   function Convert_srv_command_to_chunk_of_address   -- Used by various.
      is
      new Unchecked_conversion (
         source => srv_command,
         target => chunk_of_address_type);

   -- Renamed task entry


   procedure Memory_request (
      request_type_formal:     memory_request_type;
      chunk_of_address_formal: chunk_of_address_type;
      octet_formal:            out octet_type)
      renames Memory.Request;

   -- Embedded task: the "main show"

   task Inm_Out is

      -- This is the principal task of INM_OUT.
      -- It issues calls on the Go entry of Read_init_parameters and on
      -- Out_req entries in MEMORY, FIFO and Lnm_Out
      -- as well as Out_reset in FIFO.

      entry Srv_req (
         server_command_datum: srv_command;
         response_to_server:   out out_response);

      -- Function:
      --   This entry receives commands from the INM_SRV module and
      --   passes back results through the parameter response_to_server.

   end Inm_Out;

   -- Embedded task: an "auxiliary show"
   task Read_init_parameters is
      entry Go (
         init_num_formal: integer range 0 .. 7;
         response:        out out_response);

      -- Function:
      --   Gets init_num address chunks from INM_SRV and ships them over to
      --   the associated Memory module, forming the base address of the
      --   storage block containing the initialization parameters; then
      --   gets the initialization parameters from the Memory module.
      --
      --   Sets out_response to either sent_ok if successful or to
      --   bad_srv_command if unsuccessful. (Can be unsuccessful if required
      --   tos table size exceeds available local space.)

      entry Srv_req (
         server_command_datum: srv_command;
         response_to_server:   out out_response);

      -- Function:
      --   This entry receives commands from the INM_SRV module.
      --   Note that task Inm_Out has an identical entry.

   end Read_init_parameters;


   -- Embedded task: another "auxiliary show"

   task Translate_TOS_Task is
      -- Function:
      --   This pure server task executes concurrently with Inm_Out when
      --   performing a requested lookup in a globally accessible type_of_service
      --   translation table to determine, yea or nay, whether there is a
      --   local-net type-of-service corresponding to the given type-of-service.
      --   If yes, the matched local net tos value is indicated in the form of
      --   a returned index into the tos_table. Send_fragment will then use
      --   this value later to fish out the local net tos value to ship to the
      --   Fifo module.

      entry Begin_translation (
         inm_tos_byte: bit8);

      -- Function:
      --   This entry accepts the passed (from INM_SRV) TOS byte.
      --   The rendezvous is immediately broken to permit the calling task
      --   to resume computation. In the "statement sequel" for this entry's
      --   accept statement, the server task performs the required lookup.
      --   For a successful completion of the search, the
      --   successful_translation flag is set to true; otherwise the flag
      --   is set to false.

      entry Send_result (
         successful_translation: out boolean;
         tos_index:              out integer
                                 range 1 .. max_tos_table_size);

      -- Function:
      --   Sends back the result of the immediately preceding Begin_translation
      --   entry call. If successful_translation is true, then tos_index
      --   references the tos_table element containing the corresponding
      --   local net tos value.

   end Translate_TOS_Task;


   -- Variable declarations:

   last_result: out_response := sent_ok;

   time_out_in_milliseconds: integer range 1 .. 2**16 - 1;
                                   -- Computable from
                                   -- inm_time_out (see below)
                                   -- in procedure
                                   -- Read_init_parameters.
                                   -- Actually we may not
                                   -- compute it after all.

   local_net_tos_index: integer range 1 .. max_tos_table_size;
                                   -- Value received from call
                                   -- from Read_in_header on
                                   -- Translate_TOS_task.

   -- Variables to hold initialization parameter values:

   inm_max_packet: two_octet_record;
                                   -- Largest size packet
                                   -- for the local net.
                                   -- Represented as a pair of
                                   -- octets and also used
                                   -- as a 16-bit integer after
                                   -- applying Unchecked_
                                   -- conversion.

   inm_address_length: octet_type;
                                   -- Used in Read_in_header.

   inm_time_out: two_octet_record;
                                   -- Waiting time at LN.
                                   -- Represented as a pair of
                                   -- octets and also used
                                   -- as a 16-bit integer after
                                   -- applying Unchecked_
                                   -- conversion.

   ack_type: octet_type;
                                   -- Early/late.

   local_net_type_of_service_table_row_size: octet_type;

   number_of_local_net_types_of_service: octet_type;

   dont_care_octet: octet_type;
      -- Used as an actual parameter for Memory.Request entry
      -- calls when address chunks are being moved to the
      -- memory module.

   -- Arrays

   tos_table: octet_buffer_type(0 .. max_tos_table_size - 1);
                                   -- The size of this table
                                   -- depends on the storage
                                   -- space available in the ...

   -- Miscellaneous constants:

   dont_care_X_datum: constant chunk_of_address_type := 0;
      -- Used as an actual parameter for Memory.Request entry
      -- calls when no address chunks are actually moved.
      -- Hardware implementer may use indeterminate value.

end Inm_Out_Module;


package body Inm_Out_Module is

   procedure Do_send
   -- Function:
   --   This procedure sends an Internet datagram in the following steps:
   --     1) Reads the Internet header.
   --     2) Translates Internet TOS byte to a local net TOS.
   --     3) Constructs fragments and sends them to the local net.
   --        The option list for all but the first fragment is
   --        compacted, and the checksum for each fragment is computed.
   --   Any encountered error terminates transmission of the datagram
   --   with an appropriate value assigned to the (global) variable named
   --   last_result.
   is separate;

   task body Inm_Out
      is separate;

   task body Translate_TOS_task
      is separate;

   task body Read_init_parameters
      is separate;

end Inm_Out_Module;

-- Ada-to-Silicon Project
Un I  vara  I ty  of  Utahi 

DoD  Intarnat  Protocol  INn_0UT  aubaodula 
Rda  coda  for  tha  body  of  tha  principal  taak  naaadt 
Inm_  Out 


Varalon  of  Novaabar  1,  1982 
■eparate  ( I na_0u t_nodu I  a) 
taak  body  Ina_0ut 


Func  1 1  on  t 

Thla  Is  tha  principal  taslc  of  INt1_0UT. 

It  laauaa  calls  on  tha  Go  antry  of  Raad_l n I t_par aaa t ar a  and  on 
Out_raq  antrlaa  In  flEtlORY,  FIFO  and  Lan.Out 
aa  uall  aa  Out_raaat  In  FIFO. 


icoaaandt  arv_connand; 

lnlt_nua:  Intagar  ran^e  6  . .  7 ; 

dont_cara_oetat I  octat_typa; 


--  Uaad  aa  a  duaay. 

—  HardHara  I np I aaan t ora 

—  usa  an  I nda t a ra I na t a 
--  va I ua  . 


begin 


--  flaln  roaaand  loop 
loop 


--  Cat  naxt  coanand  froa  tha  sarvar. 
accept  Srv_raq  ( 

aarvtr_coanand_datuBi  arv_coanand; 

r aaponaa_t o_aar var I  out  out_raaponaa) 

do 

Icoamand  i=  aarvar_coaaand..datua; 

if  Icoamand  =  tast  then  --  Raport  laat  rasult. 

raaponaa_to_sarvar  i=  laat_rasult; 
end  if) 

end  Srv_raqj  —  Braak  randazvous. 


--  Nou  hondia  non-taat  arv_coaaands . 
case  I coaaand  im 

when  lnlt_.l  I  lnlt_2  I  I  n  1 1_3  I  I  n  I  t_4  => 

caae  Icomiiand  is 
when  In  I  t_l  =  > 

I n 1 1 _nun  i  -  1 j 
when  In  I  t„2  =  > 

I n I t_nua  t  =  2; 
when  I  n  1 1  _3  =  > 

I n I t_nua  :  =  3 ; 
when  In  1 1_4  =  > 

I n I t_num  t  =  4 ; 
when  others  -  > 
null] 
end  case; 


—  Start  up  task  Raad_l n I t_p araaa t ar s . 

Raad_lnit_paraastara.Co( 

I  n  I  t_num_f  or  aa  I  =>  lnlt_nua, 
raaponsa  =>  I  aa  t  _ra  su  I  t )  j 

—  End  of  Init  coanand  procaaslng.  If  unsuccasa f u  I ,  tha  raaponaa 
--  to  tha  SRV  modulo  uili  ba  btd_arv_command . 
         when send =>                      -- Get and put (move) all but
                                           -- the last addr_chunk for the address
                                           -- of the datagram from the
                                           -- SRV to the Memory module.

            -- Note: In the following two loops we have a glitch in that we
            -- are matching a server_command_datum to
            -- type chunk_of_address_formal. Looks like we need to
            -- apply Unchecked_conversion. This problem also arises
            -- in earlier versions of this task.

            for index in 1 .. init_num - 1
            loop
               accept Srv_req(
                     server_command_datum: srv_command;
                     response_to_server:   out out_response)
               do
                  lcommand := server_command_datum;
                  memory_request (
                     request_type_formal     => load_address,
                     chunk_of_address_formal => server_command_datum,
                     octet_formal            => dont_care_octet);
               end Srv_req;
            end loop;

            -- Last addr_chunk of datagram address is a special case, depending
            -- on ack_type in effect.
            accept Srv_req (
                  server_command_datum: srv_command;
                  response_to_server:   out out_response)
            do
               memory_request(
                  request_type_formal     => load_address,
                  chunk_of_address_formal => server_command_datum,
                  octet_formal            => dont_care_octet);

               -- Late_ack case, where srv is held up till Inm consumes datagram.
               if ack_type = late_ack then
                  Do_send;                 -- Do all remaining processing for
                                           -- sending this datagram.
               end if;
            end Srv_req;

            -- Now early_ack case, where srv is not held up.
            if ack_type = early_ack then
               Do_send;                    -- Do all remaining processing for
                                           -- sending this datagram.
            end if;

         when others =>
            last_result := bad_srv_command;
      end case;
   end loop;

end Inm_Out;                               -- end of task body
-- Ada-to-Silicon Project
-- University of Utah
-- DoD Internet Protocol INM_OUT submodule
-- Ada code for the body of the auxiliary task named:
--    Read_Init_Parameters (used by Inm_Out)
-- Version of November 1, 1982

separate (Inm_Out_Module)

task body Read_Init_Parameters is

   -- Accessed globals:
   -- number_of_local_net_types_of_service:      octet_type
   -- local_net_type_of_service_table_row_size:  octet_type
   -- tos_table:                                 octet_buffer_type

   -- Renamed task entry:
   -- The package Memory_Module containing the task Memory holds
   -- to-be-sent datagrams as well as initialization parameters
   -- needed by INM_OUT.

   procedure Memory_request (
         request_type_formal:     memory_request_type;
                              -- Load_address or receive_datum_octet.
         chunk_of_address_formal: chunk_of_address_type;
                              -- Don't care when request_type_formal
                              -- = receive_datum_octet.
         octet_formal:            out octet_type)
                              -- Don't care when load_address.
      renames memory.Request;

   -- Local variable declaration:
   -- The following variable is commented out. It appeared only in the
   -- "high-level" version used to read in the TOS table. See below.
   -- number_of_tos_table_octets: integer range 2 .. max_tos_table_size - 1;
   octet_register: octet_type;

begin

   loop

      accept Go (
            init_num_formal: bit4;         -- For Carter's paper
                                           -- only; otherwise bit3.
            response:        out out_response)
      do
         response := sent_ok;              -- Also means Init_Ok.

         -- Get from the server all of the addr_chunks needed to form the base
         -- address in memory that holds the initialization parameters and
         -- send these chunks to the Memory module.
         for index in 1 .. init_num_formal
         loop
            accept Srv_req (               -- Get next address
                                           -- chunk from the
                                           -- Server module.
                  server_command_datum: srv_command;
                  response_to_server:   out out_response)
            do
               Memory_request (            -- Put chunk out to the
                                           -- memory module.
                  request_type_formal     => load_address,
                  chunk_of_address_formal =>
                     Convert_srv_command_to_chunk_of_address
                        (server_command_datum),
                  octet_formal            => dont_care_octet);
            end Srv_req;
         end loop;

         -- Get the 6 individual initialization parameters (contained in the
         -- next 8 octets received) from the Memory Module.
         for index in 1 .. 8
         loop
            Memory_request (
               request_type_formal     => receive_datum_octet,
               chunk_of_address_formal => dont_care_X_datum,
               octet_formal            => octet_register);

            case index is
               when 1 => lnm_max_packet.lo   := octet_register;
               when 2 => lnm_max_packet.hi   := octet_register;
               when 3 => lnm_address_length  := octet_register;
               when 4 => lnm_time_out.lo     := octet_register;
               when 5 => lnm_time_out.hi     := octet_register;
               when 6 => ack_type            := octet_register;
               when 7 => local_net_type_of_service_table_row_size
                                             := octet_register;
               when 8 => number_of_local_net_types_of_service
                                             := octet_register;
            end case;
         end loop;


         -- Convert the local net timeout into milliseconds.
         -- time_out_in_milliseconds := lnm_time_out / 1000.0;
         --                                -- Left-hand side variable declared
         --                                -- in Inm_Out_module. Value is used
         --                                -- later in Do_send procedure.
         --                                -- Note: Davis never did this in
         --                                -- his design. Is this step needed?
         --                                -- No! We don't need this step
         --                                -- since the quotient can be
         --                                -- approximated by a div by 2**10
         --                                -- in the event we need to
         --                                -- represent milliseconds.

         -- Read in type of service translation table.
         -- The following code in comments is replaced below by a
         -- "lower-level" version that closely reflects the hardware
         -- implementation chosen, in which we eliminate the need
         -- for a multiplier.
         --
         -- number_of_tos_table_octets := local_net_type_of_service_table_row_size
         --                               * number_of_local_net_types_of_service;
         --
         -- -- Check to see if required table size exceeds maximum.
         -- if number_of_tos_table_octets > max_tos_table_size then
         --    response := bad_srv_command;
         --    return;
         -- end if;
         --
         -- for index in 1 .. number_of_tos_table_octets
         -- loop
         --    Memory_request (
         --       request_type_formal     => receive_datum_octet,
         --       chunk_of_address_formal => dont_care_X_datum,
         --       octet_formal            => tos_table(index));
         -- end loop;
         declare

;;  . . . 

loc* l_n«t_typ»_of_B«rvlc«_t*b l•_POH_g  I*,, 

•"‘*•>‘1  intagar  range  8 

..  nu»bar_of_iocal_nat_tupaa_of  aarvica 
• Joca l_nat_typa_of_aarv I ca_tab la  row  alza 

         begin
            row_number := 0;
            loop                           -- ... TOS table.
               col_number := 0;
               loop                        -- ... TOS table.
                  Memory_request(
                     request_type_formal     => receive_datum_octet,
                     chunk_of_address_formal => dont_care_X_datum,
                     octet_formal            => tos_table(index));

                  col_number := col_number + 1;
                  exit when col_number = local_net_type_of_service_table_row_size;

                  index := index + 1;
                  if index > max_tos_table_size then
                     response := bad_srv_command;
                  end if;                  -- ... accept statement.
               end loop;                   -- End inner loop.

               row_number := row_number + 1;
               exit when row_number = number_of_local_net_types_of_service;
            end loop;                      -- End outer loop.
         end;                              -- End declare block.


      end Go;                              -- End of Init processing.
   end loop;                               -- End of outer-most (infinite)
                                           -- loop.
end Read_Init_Parameters;
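
The multiplier-free table read described in the comments above can be sketched as a small stand-alone program. This is only an illustration under stated assumptions: the sizes, the `Get` procedure (a stand-in for the `Memory_request` entry call) and all names are hypothetical and do not come from the report.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Table_Fill_Sketch is
   --  Hypothetical sizes, chosen only for illustration.
   Row_Size       : constant := 2;
   Row_Count      : constant := 3;
   Max_Table_Size : constant := 8;

   type Octet is mod 2 ** 8;
   Table : array (0 .. Max_Table_Size - 1) of Octet := (others => 0);

   Next_Value : Octet := 0;

   --  Stand-in for the Memory_request call: delivers one octet per call.
   procedure Get (O : out Octet) is
   begin
      Next_Value := Next_Value + 1;
      O := Next_Value;
   end Get;

   Index    : Natural := 0;
   Overflow : Boolean := False;
begin
   --  Two nested counters replace the product Row_Size * Row_Count,
   --  so no multiplier is needed to know when the table is complete.
   Rows :
   for Row in 1 .. Row_Count loop
      for Col in 1 .. Row_Size loop
         if Index > Table'Last then
            Overflow := True;      --  table larger than the buffer
            exit Rows;
         end if;
         Get (Table (Index));
         Index := Index + 1;
      end loop;
   end loop Rows;

   Put_Line ("octets read =" & Natural'Image (Index)
             & ", overflow = " & Boolean'Image (Overflow));
end Table_Fill_Sketch;
```

With the sizes above the program reads six octets and reports no overflow. Checking the running index on each step, rather than comparing a precomputed product against the maximum, is what lets the size test be done with counters and a comparator only.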
-- Ada-to-Silicon Project
-- University of Utah
-- DoD Internet Protocol INM_OUT submodule
-- Ada code for the body of the auxiliary task named:
--    Translate_TOS_Task (used by Read_in_header)
-- Version of November 1, 1982

separate (Inm_Out_module)

task body Translate_TOS_Task
is

   -- Local variable declarations:
   index:          integer range 0 .. max_tos_table_size - 1;
   local_tos_byte: bit8;
   success:        boolean;

begin

   loop

      accept Begin_translation(inm_tos_byte: bit8)
      do
         local_tos_byte := inm_tos_byte;
      end Begin_translation;               -- Break rendezvous.

      -- Search for the INM_TOS byte in the TOS translation table.
      success := false;                    -- Initialize for search.
      index := 0;

      declare
         row_number: integer range 0 .. number_of_local_net_types_of_service - 1
                        := 0;
                                           -- The value of
                                           -- number_of_local_net_types_of_
                                           -- service is dynamically defined
                                           -- in previous action of the
                                           -- Read_Init_Parameters task.
      begin
         while row_number < number_of_local_net_types_of_service
         loop
            -- Test for the local_tos_byte in the TOS translation table.
            if tos_table(index) = local_tos_byte then
               index := index + 1;         -- Index now points at
                                           -- local net tos.
               success := true;
               exit;
            else
               index := index + local_net_type_of_service_table_row_size;
            end if;

            row_number := row_number + 1;
         end loop;
      end;                                 -- End of declare block.
                                           -- End of sequel for preceding accept statement.

      accept Send_result(
            successful_translation: out boolean;
            tos_index:              out integer
                                       range 1 .. max_tos_table_size)
                                           -- tos_index value is sent to the
                                           -- global named "local_net_tos_index"
                                           -- for use by Send_fragment.
      do
         successful_translation := success;
         tos_index              := index;
      end Send_result;
   end loop;

end Translate_TOS_Task;

Ada Specifications for the DoD Internet Protocol: The INM_OUT Submodule, Report No. 1, page 38
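
The row-stepping search performed by Translate_TOS_Task can be tried out in isolation. The following is a minimal sketch, assuming a table in which each row starts with an Internet TOS octet followed by the corresponding local net TOS octet. The table contents, the sizes and the `Translate` procedure are hypothetical, and the tasking (the two rendezvous) is deliberately left out.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Tos_Search_Sketch is
   --  Hypothetical sizes and contents, chosen only for illustration.
   Row_Size  : constant := 2;   --  Internet TOS octet, then local net TOS octet
   Row_Count : constant := 3;

   type Octet is mod 2 ** 8;
   type Octet_Table is array (Natural range <>) of Octet;

   Tos_Table : constant Octet_Table (0 .. Row_Size * Row_Count - 1) :=
     (16#00#, 16#01#,  16#10#, 16#02#,  16#20#, 16#03#);

   procedure Translate
     (Inm_Tos   : in  Octet;
      Found     : out Boolean;
      Tos_Index : out Natural)
   is
      Index : Natural := 0;
   begin
      Found     := False;
      Tos_Index := 0;
      for Row in 1 .. Row_Count loop
         if Tos_Table (Index) = Inm_Tos then
            Found     := True;
            Tos_Index := Index + 1;   --  now points at the local net TOS
            return;
         end if;
         Index := Index + Row_Size;   --  step one row; no multiplication
      end loop;
   end Translate;

   Ok  : Boolean;
   Pos : Natural;
begin
   Translate (16#10#, Ok, Pos);
   Put_Line (Boolean'Image (Ok) & Natural'Image (Pos));   --  TRUE 3
   Translate (16#7F#, Ok, Pos);
   Put_Line (Boolean'Image (Ok) & Natural'Image (Pos));   --  FALSE 0
end Tos_Search_Sketch;
```

The sketch bounds the loop with a `for` statement instead of a separately incremented row counter, which avoids having to choose a range for that counter.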
-- Ada-to-Silicon Project
-- University of Utah
-- DoD Internet Protocol INM_OUT submodule
-- Ada code for the body of the procedure:
--    Do_send
-- Version of November 1, 1982

with Inm_In_Out_Defs, Inm_Out_Defs;
use  Inm_In_Out_Defs, Inm_Out_Defs;

separate (Inm_Out_Module)

procedure Do_send

   -- Function:
   --    This procedure sends an Internet datagram in the following steps:
   --    1) Gets the Internet header from Memory_Module.
   --    2) Determines by entry calls to Translate_TOS_Task if
   --       the Internet TOS byte corresponds to a valid local net TOS.
   --    3) Constructs fragments and sends them to the local net.
   --       The option list for all but the first fragment is
   --       compacted and the checksum for each fragment is computed.
   --    Any encountered error terminates transmission of the datagram
   --    with an appropriate (explanatory) value assigned to the (global)
   --    variable, named last_result, declared in the Inm_Out_module.

is

   -- Accessed globals:
   -- unsupported_tos:      out_response;
   -- bad_header:           out_response;
   -- dont_fragment_error:  out_response;

   -- Subtype declaration:
   max_lnm_address_size: constant := 2;    -- Size in octets.

   subtype lnm_address_buffer_type is
      octet_buffer_type(0 .. max_lnm_address_size - 1);

   -- Declarations of local variables:
   lnm_address_buffer: lnm_address_buffer_type;

   header_buffer: header_buffer_type;
                              -- Header record.

   header_octet_array: header_octet_buffer_type;
                              -- Octet array used to store header.
                              -- In a hardware implementation, this
                              -- array can be the same as the
                              -- header_buffer.

   -- Need to insert here address clauses for both header_buffer and
   -- header_octet_array.

   header_length: header_length_type;
                              -- Header size in octets.

   segment_length: integer range segment_low_address ..
                                 segment_high_address;
                              -- Length of segment part of datagram
                              -- in octets.



   good_header_result: boolean;
                              -- Result of the read_in_header call.
   ok_tos_translation: boolean;
                              -- Result of the tos_translation.
   ok_fragment_transmission: boolean;
                              -- Result of the Send_fragment call.
   second_fragment: boolean;
                              -- A flag that indicates if the
                              -- current fragment is the second
                              -- fragment of the current datagram.
   more_fragments: boolean;
                              -- A flag that indicates if there are
                              -- more fragments to be formed.

   fragment_length: integer range 21 ..
                       Convert_two_octet_record_to_integer
                          (lnm_max_packet);
                              -- Used to indicate the current
                              -- fragment's length.

   current_fragment_offset: integer range 0 .. 2 ** 16 - 1;
                              -- Indicates the current fragment's
                              -- offset into the overall data
                              -- segment.

   fragment_segment_length: integer range 1 ..
                       Convert_two_octet_record_to_integer
                          (lnm_max_packet) - 20;
                              -- Used to indicate the length of the
                              -- current fragment's data part.

   datagram_total_length: integer range 21 .. 2 ** 16 - 1;
                              -- Used to save the total length of
                              -- the current datagram.

   checksum:              two_octet_record;
   checksum_with_options: two_octet_record;
                              -- Checksum values are developed
                              -- in these auxiliary variables and
                              -- later inserted into the
                              -- header_buffer prior to copying
                              -- the header to the Fifo module.

   -- Constants:
   fragment_bit_true: constant integer := 1;
                              -- Used to set the more_fragments bit
                              -- in header_buffer.flags.
   do_not_fragment_true: constant integer := 2;
                              -- Used to test if the flags field
                              -- indicates that no fragmentation
                              -- is to occur.

   -- Local procedures and functions:

   procedure Read_in_header (
         good_header: out boolean)
      -- Function:
      --    This procedure first reads in the local net address of
      --    the datagram into a local net address buffer and then reads in the
      --    datagram header octet by octet into a header buffer. Upon
      --    successfully completing the transfer of the header, the flag
      --    good_header is set to true; otherwise it is set to false.
   is separate;

   procedure Compact_options
      -- Function:
      --    This procedure is invoked when constructing the second fragment.
      --    The procedure compacts the list of options in the header by keeping
      --    only those options that are flagged to be copied. The header
      --    length and total length are also updated.
   is separate;


   procedure Send_fragment (
         data_fragment_size:               bit16;
         successful_fragment_transmission: out boolean;
         explanation:                      out out_response)
      -- Function:
      --    This procedure puts into the local net FIFO the following -
      --    1) local net address - local net address for the current fragment
      --    2) local net TOS - local net TOS for the current fragment
      --    3) fragment header
      --    4) fragment data - which is pulled out byte by byte from the Memory
      --       associated with the INM_SRV module. The size of
      --       the data fragment is passed as a parameter to this procedure.
      --    This procedure, after stuffing the FIFO, will do a timed entry call
      --    on the local net (the call must be completed in the time specified
      --    by a parameter passed down from INM_SRV). Upon successful
      --    transmission of the contents of the FIFO to the local net, the
      --    successful_fragment_transmission flag will be set to true; otherwise
      --    it is set to false. The value assigned to "explanation" confirms
      --    the success (sent_ok) or provides the reason for failure.
   is separate;

   function Minimum(
         first_operand:  integer;
         second_operand: integer)
      return integer
   is
      -- Function:
      --    This function takes 2 operands and returns the minimum of the
      --    operands.
   begin
      if first_operand > second_operand then
         return second_operand;
      else
         return first_operand;
      end if;
   end Minimum;

   ------ Body of Do_send begins here. ------

begin

   Read_in_header (good_header => good_header_result);

   if not good_header_result then
      last_result := bad_header;
      return;
   end if;

   if not (Convert_two_octet_record_to_integer(
              header_buffer.total_length) >
           Convert_two_octet_record_to_integer(
              lnm_max_packet) ) then

      ------ Begin "single packet" case.

      -- Transfer checksum_with_options, whose value was computed
      -- by Read_in_header, into the proper slot in the header_buffer.
      header_buffer.header_checksum := checksum_with_options;

      Send_fragment(
         data_fragment_size               => segment_length,
         successful_fragment_transmission => ok_fragment_transmission,
         explanation                      => last_result);

      return;
   end if;

   ------ End "single packet" case.
   ------ Begin "multiple packet" (two or more fragments) case.

   -- Fragment the datagram.
   if header_buffer.flags = do_not_fragment_true then
      last_result := dont_fragment_error;
      return;
   end if;

   -- Initialize fragmentation variables.
   current_fragment_offset  := 0;
   second_fragment          := false;
   more_fragments           := true;
   ok_fragment_transmission := true;
   datagram_total_length    := Convert_two_octet_record_to_integer
                                  (header_buffer.total_length);

   -- Back out octet containing old flags from the checksum.
   checksum_with_options.lo := checksum_with_options.lo
                               xor header_octet_array(6);

   -- Set more fragments flag in header_buffer.
   header_buffer.flags := fragment_bit_true;

   -- Update checksum with octet containing new flags value.
   checksum_with_options.lo := checksum_with_options.lo
                               xor header_octet_array(6);

   while more_fragments and ok_fragment_transmission
   loop

      if second_fragment then
         Compact_options;
         second_fragment := false;
      end if;

      fragment_length := Minimum(
         first_operand  => datagram_total_length,
         second_operand =>
            Convert_two_octet_buffer_to_integer
               (lnm_max_packet));

      fragment_segment_length := fragment_length - header_length;

      -- Insert new total length into the header and update checksum.
      -- First back out octets containing total_length from the checksum.
      checksum_with_options := checksum_with_options
                               xor header_buffer.total_length;

      header_buffer.total_length := Convert_integer_to_two_octet_record
                                       (fragment_length);

      -- Now update checksum with octets containing new total_length.
      checksum_with_options := checksum_with_options
                               xor header_buffer.total_length;

      -- Test to see if we are sending out the last fragment.
      if current_fragment_offset + fragment_segment_length =
            segment_length then
                              -- If a < condition, then we
                              -- still have another
                              -- fragment to transfer.
                              -- We should not get a > value
                              -- because the last fragment is
                              -- computed to contain
                              -- the remaining octets of the
                              -- data segment.

         -- Clear more fragments bit and adjust checksum as well.
         -- First back out octet containing old flags from the checksum.
         checksum_with_options.lo := checksum_with_options.lo
                                     xor header_octet_array(6);

         header_buffer.flags := 0;

         -- Now update checksum with octet containing new flags value.
         checksum_with_options.lo := checksum_with_options.lo
                                     xor header_octet_array(6);
      end if;

      -- Insert a new fragment offset into the header and also adjust checksum.
      -- First back out octets containing fragment offset from the checksum.
      checksum_with_options := checksum_with_options
                               xor Convert_twosome_array_to_record(
                                      header_octet_array(6 .. 7) );

      header_buffer.fragment_offset := current_fragment_offset;

      -- Now update checksum field in header_buffer with octets updated for
      -- new fragment offset.
      header_buffer.header_checksum := checksum_with_options
                               xor Convert_twosome_array_to_record(
                                      header_octet_array(6 .. 7) );

      Send_fragment(
         data_fragment_size               => segment_length,
         successful_fragment_transmission => ok_fragment_transmission,
         explanation                      => last_result);

      -- Set up parameters for the next time through the loop.
      if current_fragment_offset = 0 then
         second_fragment := true;
      end if;

      current_fragment_offset :=
         fragment_segment_length
         + current_fragment_offset;

      if not (current_fragment_offset < segment_length) then
         more_fragments := false;
      end if;

   end loop;

   ------ End "multiple packet" case.
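
The "back out, change, fold in" pattern that Do_send applies to its exclusive-or checksum can be checked with a short stand-alone program. This is a sketch only: the header octets and all names are hypothetical, and it mirrors the XOR scheme used in this listing rather than the ones'-complement checksum of the Internet Protocol standard.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Xor_Update_Sketch is
   type Octet is mod 2 ** 8;
   type Octet_Array is array (Natural range <>) of Octet;

   --  Hypothetical header octets; only the values matter here.
   Header : Octet_Array (0 .. 7) :=
     (16#45#, 16#00#, 16#00#, 16#54#, 16#12#, 16#34#, 16#20#, 16#00#);

   --  Recompute the XOR of all octets at even positions from scratch.
   function Full_Xor (H : Octet_Array) return Octet is
      Acc : Octet := 0;
   begin
      for I in H'Range loop
         if I rem 2 = 0 then
            Acc := Acc xor H (I);
         end if;
      end loop;
      return Acc;
   end Full_Xor;

   Running : Octet := Full_Xor (Header);
begin
   Running    := Running xor Header (6);   --  back out the old flags octet
   Header (6) := 16#00#;                   --  change the field
   Running    := Running xor Header (6);   --  fold in the new flags octet

   --  The incrementally updated value equals a full recomputation.
   Put_Line (Boolean'Image (Running = Full_Xor (Header)));   --  TRUE
end Xor_Update_Sketch;
```

Because XOR is its own inverse, one octet can be replaced in the running value without rereading the rest of the header. In the listing this only works if header_buffer and header_octet_array share storage, which is what the address clauses mentioned in the declarations are meant to provide.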
-- Ada-to-Silicon Project
-- University of Utah
-- DoD Internet Protocol INM_OUT submodule
-- Ada code for the body of the procedure:
--    Read_in_header (called by Do_send)
-- Version of November 1, 1982

separate (Inm_Out_module.Do_send)

procedure Read_in_header
      (good_header: out boolean)
is

   -- Function:
   --    This procedure first reads in the local net address of
   --    the datagram into a local net address buffer and then reads in the
   --    datagram header octet by octet into a header buffer. Upon
   --    successfully completing the transfer of the header, the flag
   --    good_header is set to true; otherwise it is set to false.
   --    In the course of reading in the header, it makes a pair of entry calls
   --    on translate_tos_task to obtain the local net type of service, if any,
   --    and also computes the checksums (one without and one with the options
   --    included), in two-octet records
   --    declared (and cleared) in Do_send and named checksum and
   --    checksum_with_options, respectively.


   -- Constants:
   minimum_header_length: constant integer := 20;
   high_4_bits:     constant := 240;       -- Upper 4-bit mask for an octet.
   high_3_bits:     constant := 224;       -- Upper 3-bit mask for an octet.
   low_5_bits:      constant := 31;        -- Low 5-bit mask for an octet.
   low_4_bits:      constant := 15;        -- Low 4-bit mask for an octet.
   high_octet_byte: constant := 0;         -- High byte of two_octet_buffer.
   low_octet_byte:  constant := 1;         -- Low byte of two_octet_buffer.

   -- Accessed globals:
   -- checksum:              two_octet_record;
   -- checksum_with_options: two_octet_record;
   --                                      -- Declared in Do_send;
   -- local_net_tos_index:   integer range 1 .. max_tos_table_size;

   -- Local variable declarations:
   octet:      octet_type;
   two_octets: octet_buffer_type (0 .. 1);

   -- Renamed procedures and functions:

   procedure Memory_request (
         request_type_formal:     memory_request_type;
         chunk_of_address_formal: chunk_of_address_type;
         octet_formal:            out octet_type)
      renames memory.Request;

   function Mask (
         number_to_be_masked_formal: integer;
         mask_formal:                integer) return integer
      renames Inm_In_Out_Defs.mask;

   -- Local function definition:
   function Even (
         operand: integer)
      return boolean
   is
   begin
      if operand rem 2 = 0 then
         return true;
      else
         return false;
      end if;
   end Even;


begin

   good_header := true;

   -- Get the local net address. By convention, this field always precedes
   -- the actual datagram to be sent.
   for index in 0 .. lnm_address_length - 1
   loop
      Memory_request (
         request_type_formal     => receive_datum_octet,
         chunk_of_address_formal => dont_care_X_datum,
         octet_formal            => lnm_address_buffer (index));
   end loop;

   -- Get the header's version number and length.
   Memory_request (
      request_type_formal     => receive_datum_octet,
      chunk_of_address_formal => dont_care_X_datum,
      octet_formal            => octet);

   header_buffer.version := mask
         (number_to_be_masked_formal => octet,
          mask_formal                => low_4_bits);

   header_buffer.IHL := mask
         (number_to_be_masked_formal => octet,
          mask_formal                => high_4_bits);

   -- Check the header version number.
   if not (header_buffer.version = 4) then
      good_header := false;
      return;
   elsif header_buffer.IHL * 4 < minimum_header_length then
      good_header := false;
      return;
   end if;

   -- Update octets of the two checksums.
   checksum.lo              := octet xor checksum.lo;
   checksum_with_options.lo := octet xor checksum.lo;

   -- Get the type of service octet.
   Memory_request(
      request_type_formal     => receive_datum_octet,
      chunk_of_address_formal => dont_care_X_datum,
      octet_formal            => header_buffer.type_of_service);

   -- We make the first entry call on translate_tos_task.
   Translate_TOS_Task.Begin_translation(header_buffer.type_of_service);


   -- Get the total length half word (2 octets).
   for index in 0 .. 1
   loop
      Memory_request(
         request_type_formal     => receive_datum_octet,
         chunk_of_address_formal => dont_care_X_datum,
         octet_formal            => two_octets(index));
   end loop;

   header_buffer.total_length := Convert_twosome_array_to_record
                                    (two_octets(0 .. 1));

   -- Compute the segment's length in octets.
   segment_length
      := Convert_twosome_array_to_integer(two_octets(0 .. 1)) - header_length;

   -- Update the two checksums.
   checksum := checksum xor
               Convert_twosome_array_to_record(two_octets(0 .. 1));
   checksum_with_options := checksum_with_options xor
               Convert_twosome_array_to_record(two_octets(0 .. 1));

   -- Get the identification half word (2 octets).
   for index in 0 .. 1
   loop
      Memory_request(
         request_type_formal     => receive_datum_octet,
         chunk_of_address_formal => dont_care_X_datum,
         octet_formal            => two_octets(index));
   end loop;

   header_buffer.identification := Convert_twosome_array_to_record(
                                      two_octets(0 .. 1));

   -- Update the two checksums.
   checksum := checksum xor
               Convert_twosome_array_to_record(two_octets(0 .. 1));
   checksum_with_options := checksum_with_options xor
               Convert_twosome_array_to_record(two_octets(0 .. 1));

   -- Get the flags (3 bits) and the fragment offset (13 bits).
   for index in 0 .. 1
   loop
      Memory_request(
         request_type_formal     => receive_datum_octet,
         chunk_of_address_formal => dont_care_X_datum,
         octet_formal            => two_octets(index));
   end loop;

   header_buffer.flags :=
      mask (
         number_to_be_masked_formal => two_octets(high_octet_byte),
         mask_formal                => high_3_bits);

   header_buffer.fragment_offset :=
      mask (
         number_to_be_masked_formal => two_octets(high_octet_byte),
         mask_formal                => low_5_bits)
      * shift8 + two_octets(low_octet_byte);

   -- Update the two checksums.
   checksum := checksum xor
               Convert_twosome_array_to_record(two_octets(0 .. 1));
   checksum_with_options := checksum_with_options xor
               Convert_twosome_array_to_record(two_octets(0 .. 1));

   -- Get the time-to-live octet.
   Memory_request(
      request_type_formal     => receive_datum_octet,
      chunk_of_address_formal => dont_care_X_datum,
      octet_formal            => header_buffer.time_to_live);

   -- Update the two checksums.
   checksum.lo := checksum.lo xor
                  header_buffer.time_to_live;
   checksum_with_options.lo := checksum_with_options.lo xor
                  header_buffer.time_to_live;

   -- Get the protocol octet.
   Memory_request(
      request_type_formal     => receive_datum_octet,
      chunk_of_address_formal => dont_care_X_datum,
      octet_formal            => header_buffer.protocol);

   -- Update the two checksums.
   checksum.hi := checksum.hi xor
                  header_buffer.protocol;
   checksum_with_options.hi := checksum_with_options.hi xor
                  header_buffer.protocol;

   -- Get the header checksum half word (2 octets) and dump it on the floor.
   -- It's not needed.
   for index in 0 .. 1
   loop
      Memory_request(
         request_type_formal     => receive_datum_octet,
         chunk_of_address_formal => dont_care_X_datum,
         octet_formal            => two_octets(index));
   end loop;

   -- Get the source and destination addresses and the rest of the
   -- header buffer which consists of the option octets. For all octets
   -- past the twentieth, update only one checksum. Note: no conversion
   -- routine is needed here.
   for index in 12 .. header_length - 1
   loop
      Memory_request (
         request_type_formal     => receive_datum_octet,
         chunk_of_address_formal => dont_care_X_datum,
         octet_formal            => header_buffer.octet_buffer(index));

      if Even (index) and then index < 20 then
         checksum.lo := checksum.lo xor
                        header_buffer.octet_buffer(index);
         checksum_with_options.lo := checksum_with_options.lo xor
                        header_buffer.octet_buffer(index);

      elsif Even (index) and then index >= 20 then
         checksum_with_options.lo := checksum_with_options.lo xor
                        header_buffer.octet_buffer(index);

      elsif not Even (index) and then index < 20 then
         checksum.hi := checksum.hi xor
                        header_buffer.octet_buffer(index);
         checksum_with_options.hi := checksum_with_options.hi xor
                        header_buffer.octet_buffer(index);

      else     -- not Even(index) and then index >= 20 then
         checksum_with_options.hi := checksum_with_options.hi xor
                        header_buffer.octet_buffer(index);
      end if;
   end loop;

   -- We make the second entry call on Translate_TOS_Task.
   Translate_TOS_Task.Send_result(
      successful_translation => good_header,
      tos_index              => local_net_tos_index);
                              -- Good_header is set false if
                              -- translation is unsuccessful,
                              -- in which case the value obtained
                              -- for local_net_tos_index
                              -- will be ignored.

end Read_in_header;
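
The way Read_in_header splits the two octets that carry the flags and the fragment offset can be illustrated separately. The sketch below is an assumption-laden illustration: the report does not show the body of `Inm_In_Out_Defs.mask` or the value of `shift8`, so `Mask` here is a stand-in implemented as a bitwise AND, `Shift8` is assumed to be 256, and the two input octets are hypothetical.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Field_Split_Sketch is
   High_3_Bits : constant := 224;
   Low_5_Bits  : constant := 31;
   Shift8      : constant := 256;   --  assumed value of the listing's shift8

   --  Stand-in for Inm_In_Out_Defs.mask, whose body is not in the report.
   function Mask (Number, Mask_Bits : Natural) return Natural is
      type U8 is mod 2 ** 8;
   begin
      return Natural (U8 (Number) and U8 (Mask_Bits));
   end Mask;

   --  Hypothetical input octets.
   High_Octet : constant Natural := 16#20#;
   Low_Octet  : constant Natural := 16#B9#;

   --  Flags: upper 3 bits of the high octet, left in place (not shifted).
   Flags  : constant Natural := Mask (High_Octet, High_3_Bits);

   --  Offset: lower 5 bits of the high octet, then the whole low octet.
   Offset : constant Natural :=
     Mask (High_Octet, Low_5_Bits) * Shift8 + Low_Octet;
begin
   Put_Line ("flags ="  & Natural'Image (Flags));    --  flags = 32
   Put_Line ("offset =" & Natural'Image (Offset));   --  offset = 185
end Field_Split_Sketch;
```

Multiplying by 256 plays the role of an 8-bit left shift, so the 13-bit offset is assembled from 5 high bits and 8 low bits without a shift operator.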


-- Ada-to-Silicon Project
-- University of Utah
-- DoD Internet Protocol INM_OUT submodule
-- Ada code for the body of the procedure:
--    Compact_options (called by Do_send)
-- Version of November 1, 1982

separate (Inm_Out_module.Do_send)

procedure Compact_options

   -- Function:
   --    This procedure is invoked when constructing the second fragment
   --    (and only the second fragment) of a datagram.
   --    The procedure compacts the list of options in the header by keeping
   --    only those options that are flagged to be copied. The header length
   --    and total length are also updated as well as the checksum. The
   --    value of checksum_with_options is recomputed from the value of
   --    checksum.
is

—  -  fjcaaaad  ylobala. 


--  chscKauB:  t uo_oc t a t _r acor d ; 

--  chsc  Ic  ■uB_u  I  t  h_op  t  I  ona  :  t  mo_oc  t  a  t  _r  aco  r  d  j 

—  Subtyps  dsclaratlon: 


subtype  Indax6_typa  is  Intayar  range  8  ..  2  aa  6  -  1) 

--  Bacauaa  sax  haadar  alzs  4-  64  octata. 

--  Constanta: 


op  t  I  on_a  f  f  ast :  constant  I  n  t  ay  ar  :=  28; 

—  Of  fast  (In  octa  ta) 

--  Indicatiny  uhars  tha 
--  optlona  Nat  bayina. 

haadsr_lsnyth_ulth_no_optlona:  constant  Intayar  :=  28; 
copy_op  t  I  on_trua:  constant  Intayar  :=  1; 

--  Flay  valua  Indicatiny 
--  that  tha  currant  option  la 
—  ba  coplad  to  all  frayaanta. 

—  Local  variable  dac  I  ar a 1 1 ona: 


nsu _hsa dar_ I  any t h :  Indax6_typa; 

op t I ons_l any th:  Indsx6_typa; 

currant _opt lon_lanyth:  Indax6_typa| 

I aad I ny  .cursor:  Indax6_typa; 

t ra I  I  I ny.curaor :  Indsx6_typa; 


nuBbar.of.pad.octsta:  Intayar  rsnge  8  ..  3; 
begin 

—  Doaa  this  header  has  any  optlona? 
if  h aadar _l  any t h  <=  h aadar.l any t h_H I t h_no _op t I  one  then 

return;  --  Thera  ara  no  options  to 

end  if;  —  to  "coapact*. 

--  Initialize  variables. 

op t I ona_l  anyth  :=  h a adsr _ I  any t h  -  haa dar_ I  any t h_u 1 1 h_no_op t I ona ; 


—  In  octata. 

--  Lanyth  of  optlona  ((at. 
--  Lanyth  of  a  candidatn 
--  option. 

--  Indicates  next  option 
--  conaldsrad  for  copyiny. 

—  Indicates  slot  In  haadar 
--  to  racalva  tha  next 

--  coplad  option. 


   leading_cursor := 0;
   trailing_cursor := 0;

   --  Initialize checksum_with_options from checksum.
   checksum_with_options := checksum;

   --  Begin compacting flagged options.

   while leading_cursor < options_length
      --  We use < rather than <=
      --  to avoid scanning the
      --  terminal octet, which must
      --  be an "end-of-options-list"
      --  octet.
   loop

      --  Is this option represented as a single or multiple octet?
      --  Discriminate by examining the option's number.
      if header_buffer.octet_buffer (
            option_offset + leading_cursor) rem shift5 < 2 then
         current_option_length := 1;
      else
         --  Get the next option octet. It contains the option length as its
         --  value.
         current_option_length := header_buffer.octet_buffer (
            option_offset + 1 + leading_cursor);
      end if;

      --  Determine whether or not this option should be copied.
      if Shift_right (
            header_buffer.octet_buffer (option_offset + leading_cursor), 7)
         = copy_option_true then

         for copy_index in 0 .. current_option_length - 1
         loop
            header_buffer.octet_buffer (option_offset +
                  trailing_cursor + copy_index)
               := header_buffer.octet_buffer (option_offset +
                  leading_cursor + copy_index);

            --  Update checksum_with_options. Toggle on odd- and even-valued
            --  bytes in compacted options field.
            if (trailing_cursor + copy_index) mod 2 = 0 then
               checksum_with_options.lo := checksum_with_options.lo xor
                  header_buffer.octet_buffer (
                     option_offset +
                     trailing_cursor + copy_index);
            else
               checksum_with_options.hi := checksum_with_options.hi xor
                  header_buffer.octet_buffer (
                     option_offset +
                     trailing_cursor + copy_index);
            end if;
         end loop;

         --  Update the trailing_cursor.
         trailing_cursor := trailing_cursor + current_option_length;
      end if;

      --  Update the leading_cursor.
      leading_cursor := leading_cursor + current_option_length;
   end loop;

   --  Pad out the last option word with pad octets (including the last one,
   --  which is an end-of-all-options octet) until we have reached a 32-bit
   --  boundary.

   number_of_pad_octets := 4 - (trailing_cursor mod 4);

   for copy_index in 0 .. number_of_pad_octets - 2
   loop
      --  Insert a "pad" octet (= "00000001").
      header_buffer.octet_buffer (option_offset + trailing_cursor + copy_index)
         := 1;

      --  Update checksum_with_options with pad octet (= "00000001").
      checksum_with_options := checksum_with_options xor 1;
   end loop;

   --  Now insert the last pad octet.
   --  Insert an "end-of-all-options" octet (= "00000000").
   --  Note that the zero value of the end-of-all-options octet
   --  will not change the value of the current checksum; hence there is
   --  no update of the checksum for this octet.
   header_buffer.octet_buffer (
         option_offset + trailing_cursor + number_of_pad_octets - 1)
      := 0;

   new_header_length
      := option_offset + trailing_cursor + number_of_pad_octets;

   --  Update the total length field and the checksum.

   --  First back out octets containing total_length from the checksum.
   checksum_with_options := checksum_with_options xor
      Convert_twosome_array_to_record (
         header_octet_array (2 .. 3));

   header_buffer.total_length :=
      Convert_integer_to_two_octet_record (
         Convert_two_octet_record_to_integer (
            header_buffer.total_length)
         - header_length + new_header_length);

   --  Now update checksum with octets containing new total_length.
   checksum_with_options := checksum_with_options xor
      Convert_twosome_array_to_record (
         header_octet_array (2 .. 3));

   --  Update the IHL field and the checksum.

   --  Back out octet containing old IHL value from the checksum.
   checksum_with_options.lo := checksum_with_options.lo
      xor header_octet_array (0);

   header_buffer.IHL := Shift_right (new_header_length, 4);

   --  Update checksum with octet containing new IHL value.
   checksum_with_options.lo := checksum_with_options.lo
      xor header_octet_array (0);

end Compact_options;
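
The two-cursor compaction performed by Compact_options can be sketched on its own. The following is an illustration under stated assumptions, not the project's code. It uses a present-day Ada compiler with the standard Interfaces package, works on a bare options array rather than on header_buffer, and omits the checksum, padding, total length and IHL updates. The names Compact_Sketch, Compact, Octet, Leading, Trailing and Kept are hypothetical, as are the sample option octets, and the guard against malformed options is an addition that the listing does not have.

```ada
with Ada.Text_IO;
with Interfaces; use Interfaces;

procedure Compact_Sketch is

   subtype Octet is Unsigned_8;
   type Octet_Array is array (Natural range <>) of Octet;

   --  Compact Options in place, keeping only the options whose copy flag
   --  (top bit of the option-type octet) is set.  Kept receives the number
   --  of octets retained at the front of the array.
   procedure Compact
     (Options : in out Octet_Array;
      Kept    : out Natural)
   is
      Leading  : Natural := Options'First;  --  next option to examine
      Trailing : Natural := Options'First;  --  next free slot
      Length   : Natural;
   begin
      while Leading <= Options'Last loop
         if (Options (Leading) and 16#1F#) < 2 then
            Length := 1;   --  option numbers 0 and 1 are single octets
         elsif Leading < Options'Last then
            Length := Natural (Options (Leading + 1));
         else
            Length := 0;   --  truncated option: no length octet
         end if;

         --  Stop on a malformed option rather than loop forever or overrun.
         exit when Length = 0 or else Leading + Length - 1 > Options'Last;

         if Shift_Right (Options (Leading), 7) = 1 then
            for Offset in 0 .. Length - 1 loop
               Options (Trailing + Offset) := Options (Leading + Offset);
            end loop;
            Trailing := Trailing + Length;
         end if;

         Leading := Leading + Length;
      end loop;

      Kept := Trailing - Options'First;
   end Compact;

   Options : Octet_Array (0 .. 7) :=
     (16#07#, 16#03#, 16#04#,   --  record route: copy flag clear
      16#01#,                   --  no-operation: copy flag clear
      16#83#, 16#03#, 16#04#,   --  loose source route: copy flag set
      16#00#);                  --  end of option list
   Kept : Natural;

begin
   Compact (Options, Kept);

   pragma Assert (Kept = 3);
   pragma Assert
     (Options (0) = 16#83# and Options (1) = 16#03#
      and Options (2) = 16#04#);

   Ada.Text_IO.Put_Line ("octets kept:" & Natural'Image (Kept));
end Compact_Sketch;
```

Because the trailing cursor never runs ahead of the leading cursor, the copy can be done in place without a scratch buffer, which is the same property the listing relies on.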


--  Ada-to-Silicon Project
--  University of Utah
--  DoD Internet Protocol INM_OUT submodule
--  Ada code for the body of the procedure:
--     Send_fragment (called by Do_send)
--  Version of November 1, 1982

with Fifo_Module, Local_Net_Module;

separate (Inm_Out_Module.Do_send)

procedure Send_fragment (
   data_fragment_size:               bit16;
   successful_fragment_transmission: out boolean;
   explanation:                      out out_response)

--  This procedure puts into the local net FIFO the:
--     1) local net address - local net address for the current fragment
--     2) local net TOS     - local net TOS for the current fragment
--     3) header
--     4) data              - which is pulled out byte by byte from the Memory
--                            associated with the INM_SRV module. The size of
--                            the data fragment is passed as a parameter to
--                            this procedure.
--
--  This procedure, after stuffing the FIFO, will do a timed entry call
--  on the local net (the call must be completed in the time specified
--  by a parameter passed down from INM_SRV). Upon ... the
--  successful fragment-transmission flag will be set
--  to true or to false. The value assigned to "explanation" confirms
--  the success (sent_ok) or provides the reason for failure.
is

   --  Renamed task entries:

   procedure Memory_request (
      request_type_formal:     memory_request_type;
      chunk_of_address_formal: chunk_of_address_type;
      octet_formal:            out octet_type)
   renames Memory.Request;

   procedure Local_net_out_req (
      command_formal:  local_net_command_type;
      response_formal: out local_net_response_type)
   renames Local_Net_Module.Local_Net.Out_req;

   procedure Fifo_req (
      command_formal: fifo_command_type;
      octet_formal:   octet_type)
   renames Fifo_Module.Fifo.Fifo_req;

   --  Local variable declarations:

   octet_register:     octet_type;
   local_net_response: local_net_response_type;

begin

   successful_fragment_transmission := true;

   --  Reinitialize the FIFO.
   Fifo_req (
      command_formal => reset,
      octet_formal   => dont_care_octet);

   --  Load the FIFO with the fragment's local net address previously
   --  saved in the inm_address_buffer.
   for Index in 0 .. inm_address_length - 1
   loop
      Fifo_req (
         command_formal => store,
         octet_formal   => inm_address_buffer (Index));
   end loop;

   --  Load the FIFO with the local net TOS.
   for Index in local_net_tos_index ..
         local_net_type_of_service_table_row_size
   loop
      Fifo_req (
         command_formal => store,
         octet_formal   => tos_table (Index));
   end loop;

   --  Load the fragment's header into the FIFO.
   for Index in 0 .. header_length - 1
   loop
      Fifo_req (
         command_formal => store,
         octet_formal   => header_octet_array (Index));
   end loop;

   --  Get the data fragment from the memory and load it into the FIFO.
   for data_index in 0 .. segment_length - 1
   loop
      Memory_request (
         request_type_formal     => receive_datum_octet,
         chunk_of_address_formal => dont_care_X_datum,
         octet_formal            => octet_register);

      Fifo_req (
         command_formal => store,
         octet_formal   => octet_register);
   end loop;

   --  Do a timed entry call on the local net indicating that
   --  the FIFO has a fragment with local net information in it.
   select                                 --  Conditional select.

      Local_net_out_req (                 --  Was fragment received?
         command_formal  => receive_fragment,
         response_formal => local_net_response);

   or

      delay time_out_in_milliseconds;     --  Value was computed by
                                          --  Read_init_parameters.

      --  The local net rendezvous has timed out.
      explanation := local_net_time_out;
      successful_fragment_transmission := false;

   end select;

   --  Test to see if the local net received the fragment.

   if successful_fragment_transmission    --  Local net did not time out.
      and then                            --  Short-circuit phrase.
      not (local_net_response = fragment_received_ok) then

      --  Something is wrong.
      explanation := local_net_error;
      successful_fragment_transmission := false;

   end if;

